


Calhoun: The NPS Institutional Archive 
DSpace Repository





1993-09 

A porting methodology for parallel database systems 

Watkins, Stanley Hugh. 

Monterey, California. Naval Postgraduate School 


http://hdl.handle.net/10945/40011 


This publication is a work of the U.S. Government as defined in Title 17, United 
States Code, Section 101. Copyright protection is not available for this work in the 
United States. 

Downloaded from NPS Archive: Calhoun 



http://www.nps.edu/library


Calhoun is the Naval Postgraduate School's public access digital repository for
research materials and institutional publications created by the NPS community.
Calhoun is named for Professor of Mathematics Guy K. Calhoun, NPS's first
appointed, and published, scholarly author.

Dudley Knox Library / Naval Postgraduate School
411 Dyer Road / 1 University Circle
Monterey, California USA 93943 






NAVAL POSTGRADUATE SCHOOL 
Monterey, California 


THESIS 



Approved for public release; distribution is unlimited. 









SECURITY CLASSIFICATION OF THIS PAGE




REPORT DOCUMENTATION PAGE 


UNCLASSIFIED 







Approved for public release; 
distribution is unlimited 




Computer Science Dept.
Naval Postgraduate School


6c. ADDRESS (City, State, and ZIP Code)

Monterey, CA 93943-5000 




Naval Postgraduate School 


7b. ADDRESS (City, State, and ZIP Code) 

Monterey, CA 93943-5000 



8c. ADDRESS (City, State, and ZIP Code) 


11. TITLE (Include Security Classification)

A Porting Methodology for Parallel Database Systems (U) 






TIME COVERED: FROM 12/92 TO 09/93
14. DATE OF REPORT (Year, Month, Day): September 1993
PAGE COUNT: 97


The views expressed in this thesis are those of the author and do not reflect the official
policy or position of the Department of Defense or the United States Government.






16. SUBJECT TERMS (Continue on reverse if necessary and identify by block number) 

Parallel Database, Multilingual and Multimodal Database, Multibackend
Database Computer, Porting, MDBS, Heterogenous Database System


19. ABSTRACT (Continue on reverse if necessary and identify by block number) 

The Multibackend Database Supercomputer (MDBS) pioneered in the Naval Postgraduate School Laboratory
for Database Systems Research offers an elegant solution to the four most pressing problems associated with the
traditional approach to very large database management systems: capacity growth, performance improvement, data
sharing, and resource consolidation. The purpose of this thesis is to develop a theory of system software portability for
this large and complex network application which will facilitate others in the installation and utilization of MDBS.

The first challenge is the almost total lack of documentation about MDBS software of use to system porters. A
second set of issues revolves around the use of hardware by MDBS, particularly the use of mass storage devices for
the storage and manipulation of base- and meta-data. A third challenge concerns the portability of system calls, shell
programs, and the C language implementation. A final set of portability issues arises from the extensive use of
inter-process and inter-machine communications by MDBS.










































Our approach to this project involves first analyzing the aforementioned portability issues. This analysis is tested
by porting the most advanced version of MDBS software to a different platform supported by different hardware and
operating system software.

This thesis provides a framework in which to understand and assess specific portability concerns about MDBS.
We describe the original routines for accessing the mass storage devices and explain why it was necessary to modify
them for portability. We identify and discuss other hardware-specific information coded into MDBS. We identify and
correct problems related to the recompilation of the MDBS code on the new platform. We provide a detailed analysis
of the requirements for and the implementation of inter-process and inter-machine communications for MDBS. In
addition, we expand system debugging features, improve documentation, provide a new demonstration database, and
offer advice for future porters of MDBS.













Approved for public release; distribution is unlimited 


A Porting Methodology for Parallel Database Systems 


by 

Stanley Hugh Watkins 
Major, United States Marine Corps 

Bachelor of Science (Political Science), United States Naval Academy, 1980


Submitted in partial fulfillment of the 
requirements for the degree of 


MASTER OF COMPUTER SCIENCE 


from the


NAVAL POSTGRADUATE SCHOOL


September 1993


Author: Stanley Hugh Watkins


Approved By: Dr. David K. Hsiao, Thesis Advisor


Department of Computer Science













ABSTRACT 


The Multibackend Database Supercomputer (MDBS) pioneered in the Naval Postgraduate
School Laboratory for Database Systems Research offers an elegant solution to the
four most pressing problems associated with the traditional approach to very large database
management systems: capacity growth, performance improvement, data sharing, and
resource consolidation. The purpose of this thesis is to develop a theory of system software
portability for this large and complex network application which will facilitate others in the
installation and utilization of MDBS.

The first challenge is the almost total lack of documentation about MDBS software
of use to system porters. A second set of issues revolves around the use of hardware by
MDBS, particularly the use of mass storage devices for the storage and manipulation of
base- and meta-data. A third challenge concerns the portability of system calls, shell
programs, and the C language implementation. A final set of portability issues arises from the
extensive use of inter-process and inter-machine communications by MDBS.

Our approach to this project involves first analyzing the aforementioned portability
issues. This analysis is tested by porting the most advanced version of MDBS software to a
different platform supported by different hardware and operating system software.

This thesis provides a framework in which to understand and assess specific portability
concerns about MDBS. We describe the original routines for accessing the mass storage
devices and explain why it was necessary to modify them for portability. We identify and
discuss other hardware-specific information coded into MDBS. We identify and correct
problems related to the recompilation of the MDBS code on the new platform. We provide
a detailed analysis of the requirements for and the implementation of inter-process and
inter-machine communications for MDBS. In addition, we expand system debugging
features, improve documentation, provide a new demonstration database, and offer advice for
future porters of MDBS.












ACKNOWLEDGEMENTS 

I would like to thank Dr. David K. Hsiao for his insight and advice in all database 
matters and for his assistance reviewing this thesis. Without his knowledge and expertise, 
the Multibackend Database System project and this thesis would not be possible. 

I would also like to thank the support staff at the Naval Postgraduate School for their 
help. Especially, I would like to thank John Locke and Mike Williams for their assistance 
with MDBS software and hardware. 

Finally, I would like to thank my wife, Stella, for her patience. 








TABLE OF CONTENTS 


I. THE INTRODUCTION
A. THE BACKGROUND FOR THIS THESIS
B. THE AIM OF THIS THESIS
C. THE SETTING OF THIS THESIS
D. THE ORGANIZATION OF THIS THESIS
II. THE APPROACH
A. AN INTRODUCTION
B. THE TASK
C. GUIDING CONSIDERATIONS
D. THE INITIAL PLAN
E. MODIFICATIONS TO THE PLAN
III. SOFTWARE ISSUES
A. AN INTRODUCTION
B. OPERATING SYSTEM CONSIDERATIONS
1. System Call Interface to the Kernel
2. Differences in the Shell
3. Differences in Stack Implementation
4. Path Name and File Name Considerations
C. C LANGUAGE COMPILER & LINKER CONSIDERATIONS
1. The C Language Library
2. Type Conversion
3. Error-Checking
4. The Linker and the Makefiles
D. IN CONCLUSION
IV. HARDWARE ISSUES
A. AN INTRODUCTION
B. FIXED DISK ISSUES FOR MDBS
1. Fixed Disk Requirements
2. Understanding MDBS Disk Utilization
3. Implementation Details of the Meta Disk
4. Implementation Details of the Base-Data Disk
5. Disk Initialization
6. The Problem and the Solution
7. Hard-Coded Disk Information
C. OTHER HARDWARE SPECIFIC ISSUES FOR MDBS
1. Workstation Information
2. Network Communication Information
D. IN CONCLUSION
V. COMMUNICATIONS ISSUES
A. AN INTRODUCTION














































B. MDBS COMMUNICATIONS REQUIREMENTS AND IMPLEMENTATION
1. Inter-Process Communications
2. Inter-Machine Communications
C. COMMUNICATION CHANNEL DESIGN AND FUNCTION
1. Process Functions
2. Design of the Communications Channels
3. How MDBS Establishes Communications
D. DETAILS IMPORTANT TO PORTERS OF MDBS
1. Limitations on Message Lengths
2. Access to the Network
3. Socket Definitions
4. The Socket Directory
5. A New Aid for Timing Messages
E. IN CONCLUSION
VI. MISCELLANEOUS OTHER ISSUES
A. AN INTRODUCTION
B. MDBS DIRECTORY INFORMATION
C. ADDITION/DELETION OF LANGUAGE INTERFACE MODULES
VII. CONCLUSIONS
A. AN INTRODUCTION
B. WHAT HAS BEEN ACCOMPLISHED
C. TOPICS FOR FURTHER RESEARCH
D. IMPLEMENTATION ISSUES
E. RECOMMENDATIONS
APPENDIX A. SYSTEM COMPARISON
A. HARDWARE
1. Old hardware
a. Controller: ISI-V model V24S workstation
b. Backend: ISI-V model V24S workstation
2. New Hardware
a. Controller: Sun model 4/110 workstation
b. Backend: Sun model 4/280 workstation
B. SOFTWARE
a. Old software
b. New software
APPENDIX B. AIDS TO DEBUGGING
A. AN INTRODUCTION
B. USING DEBUGGING FLAGS
C. USING TRACE FILES
D. CHECKING SYSTEM STARTUP
E. MAINTAINING A HISTORY
F. UNIX DEBUGGING TOOLS
















































G. PROGRAMMING TIPS
APPENDIX C. CONTROLLER DIRECTORY AND FILE STRUCTURE
A. THE “mdbs” DIRECTORY
B. THE “Sockets” DIRECTORY
C. THE “UserFiles” DIRECTORY
D. THE “bin” DIRECTORY
E. THE “version” DIRECTORY
F. THE “version/bin” DIRECTORY
G. THE “run” DIRECTORY
H. THE “version/COMMON” DIRECTORY
I. THE “BE” DIRECTORY
J. THE “BE/COMMON” DIRECTORY
K. THE “BE/BCOM” DIRECTORY
L. THE “BEAX” DIRECTORY
M. THE “BE/DIO” DIRECTORY
N. THE “BE/DM” DIRECTORY
O. THE “BE/RECP” DIRECTORY
P. THE “version/CNTRL” DIRECTORY
Q. THE “CNTRL/COMMON” DIRECTORY
R. THE “CNTRL/CCOM” DIRECTORY
S. THE “CNTRL/DG” DIRECTORY
T. THE “CNTRL/PP” DIRECTORY
U. THE “CNTRL/REQP” DIRECTORY
V. THE “CNTRL/n” DIRECTORY
W. THE “CNTRL/n/LangIF” DIRECTORY
APPENDIX D. BACKEND DIRECTORY AND FILE INFORMATION
A. THE “AT’ ROOT DIRECTORY
B. THE “Sockets” DIRECTORY
C. THE “UserFiles” DIRECTORY
D. THE “be.version” DIRECTORY
E. THE “bin” DIRECTORY
APPENDIX E. DEMONSTRATION DATABASE
A. THE DESCRIPTOR FILE (AMMO.d)
B. THE TEMPLATE FILE (AMMO.t)
C. THE RECORD FILE (AMMO.r)
D. THE QUERY FILE (AMMOreql)
APPENDIX F. NEW MDBS FUNCTIONS
A. THE “host_name_integer” FUNCTION
B. THE “init_meta_NATT” FUNCTION
LIST OF REFERENCES
INITIAL DISTRIBUTION LIST















































LIST OF FIGURES 


Figure 1. The Multibackend Database Supercomputer
Figure 2. Multimodel, Multilingual, and Cross-Model Access Capabilities of MDBS
Figure 3. The Original MDBS Configuration
Figure 4. The Intermediate MDBS Configuration
Figure 5. The Final MDBS Configuration
Figure 6. Meta-Data Disk Storage Format
Figure 7. Record Distribution on a Base Data Disk Drive
Figure 8. Modified Meta-Data Disk Storage Format
Figure 9. MDBS Communication Channels














I. THE INTRODUCTION


A. THE BACKGROUND FOR THIS THESIS 

Today, governments and corporations are struggling to come to grips with the
explosive growth of information processing requirements. The Department of Defense
alone spends over nine billion dollars annually and is highly dependent on general-purpose
data processing hardware, software and services [United States House of Representatives,
1989, p. 1]. Database systems are an important part of the information processing
challenge. Unfortunately, the performance of database systems has not kept pace with the
technical advances in the computer industry as a whole [Elmasri, 1989, p. 637].

The Office of Naval Research (ONR) has identified two interoperable information
systems technology issues. The first of these is data sharing. Data sharing (or data
exchange) involves the ability to transparently access other users' databases. Traditional
heterogenous database management systems (dbms) do not support access by users
employing other data models and languages. For example, IBM's relational dbms, DB2,
cannot access non-relational databases. Instead, IBM replicates non-relational data in
the heterogenous form in separate non-relational dbms. Any replication of data at different
locations or in different data models is not an answer, since this introduces data integrity
problems and increases storage requirements. The second technology issue identified by
ONR is that of resource consolidation. Resource consolidation refers to more efficient use
of database hardware, software, and support personnel by consolidating them into one
computing environment. The proliferation of stand-alone (homogenous/heterogenous)
dbms represents needless duplication in a time of shrinking budgets.

A third database technology issue is processing speed. General-purpose, von Neumann
type computers are not optimized for the tasks performed upon databases. Large databases
are maintained in the secondary storage (e.g., disks). Advances in the speed of
secondary-storage devices have not kept pace with advances in the speed of central processing units.
This results in processing delays on the ever-increasingly large amounts of information
stored in the secondary storage [Elmasri, 1989, p. 638].

A fourth database technology issue is capacity growth. As organizations become
larger and more sophisticated, their information processing needs grow. This is not just the
problem of data storage. More significant is the need to process the increasing amounts of
data. The traditional solution to this problem has been the acquisition of larger, more
powerful machines. The restrictions on budget growth and the long lead-time associated
with the acquisition of new hardware make this an impractical approach.

Therefore, a new approach is needed to deal with these four technology issues. Such
alternatives are generically referred to as database machines [Elmasri, 1989, p. 638]. One
database machine which elegantly addresses all four of these issues is the subject of a
continuing project in the NPS Laboratory for Database Systems Research. The project
focuses on the Multibackend Database Supercomputer (MDBS) which is designed to run
on standard, off-the-shelf hardware and identical system software consisting of networked
UNIX workstations. A single general-purpose workstation serves as a controller. Multiple,
identical workstations with their respective high capacity drives for the storage of base and
meta data serve as backends. The controller receives queries from the users and broadcasts
them to the backends which then return the results to the controller for post processing and
routing to the users.

Base data of any given database are clustered. The clustered data are placed across 
many backend disks. They are configured in a loosely-coupled, parallel architecture which 
leads to parallel accesses to the base data. MDBS is also scalable, since an increase in 
performance can be achieved by attaching additional parallel backends to share the 
workload. MDBS is unique in that the response-time reduction is in direct proportion to the 
number of backends employed [Hsiao, 1991, p. 44 and Hall, 1989]. Figure 1 illustrates the 
hardware architecture of MDBS. This is a highly scalable architecture whose performance 
can be tailored to the needs of the user. 
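
The scaling claim above can be illustrated with a small model. The following is a minimal sketch, not MDBS code: it assumes a round-robin placement of a cluster's records and counts records scanned per backend as a stand-in for response time. The function names, the placement rule, and the record counts are all assumptions made for illustration.

```python
# Illustrative model only; not MDBS code. All names and numbers are assumptions.

def place_round_robin(records, n_backends):
    """Spread the records of one cluster evenly across the backend disks."""
    backends = [[] for _ in range(n_backends)]
    for i, record in enumerate(records):
        backends[i % n_backends].append(record)
    return backends


def broadcast_query(backends, predicate):
    """Controller view: send the same query to every backend, merge replies."""
    partial_results = [[r for r in disk if predicate(r)] for disk in backends]
    merged = [r for part in partial_results for r in part]
    # With backends scanning in parallel, elapsed time follows the largest share.
    scan_cost = max(len(disk) for disk in backends)
    return merged, scan_cost


if __name__ == "__main__":
    records = list(range(1200))
    for n in (1, 2, 4):
        result, cost = broadcast_query(place_round_robin(records, n),
                                       lambda r: r % 100 == 0)
        print(n, "backend(s): records scanned per backend =", cost,
              "matches =", len(result))
```

In this idealized model, doubling the number of backends halves the number of records each backend must scan while the merged answer stays the same. Real response times also depend on communication and post-processing costs, which the sketch ignores.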


Figure 1. The Multibackend Database Supercomputer
(diagram labels: meta data disk, base data disks, paging disk)















The MDBS also answers ONR’s concerns about data sharing among users employing different
data models and data languages. MDBS employs a single “kernel” data model and data language. The
capability to interface with users working in other data models and languages is provided for through
the addition of language interface software modules with schema transformers and transaction
translators [Hsiao and Kamel, 1989, p. 45]. This use of a
multiple-data-models-and-languages-to-single-data-model-and-language (“many-to-one”) mapping avoids the integrity pitfalls of data
replication and permits global concurrency control. Language interfaces for the hierarchical, network,
relational, and object-oriented models have been successfully integrated into MDBS. For more
information on the construction of language interfaces, see [Bourgeois, 1993, p. 29]. A
functional/DAPLEX model interface is under construction. Most exciting is the cross-model accessing capability
provided by the transformers and translators which allows, for instance, a relational user to access and
perform operations upon a database created by a hierarchical user [Zawis, 1987, p. 30 - 74]. This
cross-model access is transparent to the user, meaning that no retraining of the relational user in the
hierarchical data manipulation language is required. Figure 2 represents the multimodel, multilingual,
and cross-model accessing capabilities of MDBS. The Multibackend Database Supercomputer
provides great flexibility in answering the problem of data sharing.
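
The “many-to-one” mapping described above can be sketched as a registry of translators that all emit the same kernel request. This is a hypothetical illustration only: the kernel request format, the two toy query shapes, and every function name below are invented for the example and do not reproduce any actual MDBS language interface.

```python
# Hypothetical sketch: the kernel request format and both translators are
# invented for illustration and do not reproduce any MDBS language interface.

def translate_relational(query):
    """Map a tiny relational-style request onto the common kernel form."""
    return {"op": "RETRIEVE",
            "file": query["table"],
            "where": dict(query["where"]),
            "targets": list(query["columns"])}


def translate_hierarchical(query):
    """Map a tiny hierarchical-style request onto the same kernel form."""
    return {"op": "RETRIEVE",
            "file": query["segment"],
            "where": dict(query["qualification"]),
            "targets": list(query["fields"])}


# One translator per user data model, all targeting a single kernel form.
TRANSLATORS = {
    "relational": translate_relational,
    "hierarchical": translate_hierarchical,
}


def to_kernel(model, query):
    """Many-to-one mapping: every user model funnels into one kernel request."""
    return TRANSLATORS[model](query)


if __name__ == "__main__":
    rel = {"table": "AMMO", "columns": ["type"], "where": {"qty": 10}}
    hier = {"segment": "AMMO", "fields": ["type"], "qualification": {"qty": 10}}
    print(to_kernel("relational", rel) == to_kernel("hierarchical", hier))
```

Because both users end up issuing the same kernel request against a single stored copy of the data, no replication is needed. That is the property the paragraph above attributes to the many-to-one design.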

Finally, the Multibackend Database Supercomputer also addresses the problem of resource
consolidation. The multi-model and multi-language capabilities of MDBS render obsolete the plethora of
stand-alone heterogenous database management systems currently in use in the DOD community. The
performance of MDBS can be tailored to suit the needs of the user. A single database management
system with a single scalable architecture, MDBS replaces many separate database management
systems with different architectures. Support personnel only need to know and to support one system
(i.e., MDBS). The result is a desirable consolidation and standardization of resources.

Because it offers practical solutions to the problems of data sharing, resource consolidation, 
capacity growth and performance gain, we believe the Multibackend Database Supercomputer points 
the way to the future for users of very large heterogenous databases. 


Figure 2. Multimodel, Multilingual, and Cross-Model Access Capabilities of MDBS
(diagram labels: a kernel database user, a hierarchical database user, an object-oriented
database user, a network database user, a relational database user)



















B. THE AIM OF THIS THESIS 

As described above, MDBS is a very promising approach to the problems facing large
database users. Only off-the-shelf hardware and identical system software are used. This
supports a claim that MDBS is not hardware dependent and should be relatively easy to
migrate to a new hardware platform. Indeed, an earlier version of MDBS was ported from
its original platform consisting of one VAX-11/780 and two PDP-11/44 computers running
VMS and RSX operating systems to its current platform consisting of seven ISI
workstations running the Berkeley UNIX operating system [Wong, 1986, p. 7-8]. The aim of
this thesis is to migrate MDBS from its current platform to a newer RISC-based platform
running the SunOS operating system and, in doing so, to develop a theory of system software
portability for MDBS. This will become important as the interest in porting MDBS to other platforms
grows. This theory of system software portability will address hardware issues, language
issues, communication issues, and other issues important to any future porting of MDBS.
It is hoped that this work will ease the job of moving MDBS to other platforms and spur
more interest in MDBS.

C. THE SETTING OF THIS THESIS 

The Multibackend Database Supercomputer is located in the Laboratory for Database
Systems Research at the Naval Postgraduate School. The current hardware consists of
seven ISI workstations based on the Motorola 68020 processor. All ISI workstations are
using the Berkeley 4.3BSD operating system. The new hardware consists of a single Sun 4
Model 100 with two hard disk drives for a controller and two Sun 4 Model 280s each with
three hard disk drives for backends. All three Sun machines operate under SunOS 4.1.1.
The inter-machine-communications link remains a standard Ethernet cable. For a detailed
description of the old and new platforms, see Appendix A.

The immediate motivations for making this move are threefold. The first motivation
is reliability. The normal lifetime of computer hardware is five years; the current ISI
workstations are now seven years old. The support personnel are experiencing difficulty in
keeping these machines operational.

The second motivation is the promise of increased hardware performance. My thesis
advisor has estimated that the overall performance increase will be about 25%. The
increased performance is important to both MDBS operations and the associated activities
of researchers. For some applications, such as recompiling large sections of the code, the
new hardware will offer a welcome increase in productive time.

The third motivation is provided by the opportunity to utilize the better software
available to users of SunOS on the SPARC architecture. The new platform will more easily
support a much wider range of software useful to the researchers. This includes immediate
X-Windows support plus access to better compilers and other tools.

D. THE ORGANIZATION OF THIS THESIS 

This thesis is organized into seven chapters. Chapter I consists of this introduction. In
Chapter II, we outline the approach taken to accomplish the porting and suggest ways in
which this could have been improved. In Chapter III, we consider software issues important
to the migration. In Chapter IV, we discuss the specific hardware issues in porting MDBS
and describe the changes made in moving MDBS to the new platform. MDBS is
communications-intensive, and in Chapter V we are concerned with these communications
issues. In Chapter VI, a collection of other, smaller issues encountered in porting MDBS
is presented. Finally, in Chapter VII, we conclude the thesis and outline problem areas and
related issues requiring further study.

The chapters are supported by six appendices. These six appendices contain
additional technical details, observations, and segments of code. In Appendix A, we
provide a detailed comparison of the old and new platforms. In Appendix B, we provide
information on the use of debugging flags, makefiles, and helpful scripts associated with
MDBS. In Appendix C, a description of the structure and a listing of the files associated
with the controller are given. In Appendix D, a similar listing for a standard backend is
given. In Appendix E, the details about the sample database created to demonstrate the
functioning of MDBS on the new platform are given. In Appendix F, the code for the new
functions added to MDBS is provided.









II. THE APPROACH 


A. AN INTRODUCTION 

In a sizable project such as the porting of a large UNIX application, the importance of
an organized approach to the porting task cannot be overemphasized. The software of the
Multibackend Database Supercomputer is both large and complex. MDBS consists of
approximately 527 separate files distributed over 68 subdirectories with an aggregate size
of about 23 megabytes (source code, object code, executables, scripts, and temporary files).
The creation of the twelve processes (six in the controller, six in each backend) required to
run MDBS is controlled by forty makefiles at different locations in the controller code.
Each makefile contains information and instructions for compiling and linking the
executable programs. On top of the size and scope issues, there are issues of hardware
delivery schedules and the interdependent time-tables of co-workers. This chapter presents
the evolution of the porting approach utilized to port MDBS to Sun-4 workstations and
concludes with some recommendations for future porting projects.

B. THE TASK 

The successful completion of this project involves changing both hardware and
software. The old ISI-V workstations, based on the Motorola 68020 processor and operated
under the Berkeley UNIX (4.3 BSD), are to be replaced by Sun workstations, based on the
SPARC architecture and operated under Sun UNIX (SunOS 4.1.1). It is significant that
the same Ethernet communications bus is to be used for the new system. A detailed
description of the platforms involved may be found in Appendix A. The workstation
hardware and operating-system software preparations are the responsibility of the support
staff. Since MDBS requires only off-the-shelf hardware and system software, their work is
not detailed here. This thesis instead concentrates on what has been done to move and adapt
the MDBS software to the new platform. Simply stated, this involves relocating the code,
modifying it to function as originally intended, and recompiling all the executables. To
properly modify the student-and-researcher-written software is the heart of the problem.






C. GUIDING CONSIDERATIONS 


Three considerations shaped the work on this project. First, time is of the essence.
Only a limited amount of time is available for the completion of the porting. Other
researchers desire to use the enhanced capabilities of the new system. This consideration
has limited modifications to those required to get MDBS up and operating on the new
platform. Non-essential things which require improvement are documented, but such
improvements must wait until later. A list of recommendations for further work may be
found in Chapter VII. A listing of the two completely new functions added to MDBS during
this porting project may be found in Appendix F.

The second consideration influencing the approach to this project has been the
uncertainty as to the delivery date of the new hardware. The transfer of the desired
hardware is contingent upon other acquisitions, and the likelihood of a delay has been real.
This consideration has resulted in a conscious effort to emphasize the advanced preparatory
work which could be accomplished prior to the delivery of the new hardware. The
preparatory work consisted of a thorough analysis of the structure and functioning of
MDBS, a study of the similarities and differences between the old and new operating
systems, and familiarization and experimentation with the new system’s compiler. This
advanced preparatory work has paid off handsomely, resulting later in a relatively smooth
implementation with a minimum of unforeseen difficulties.

The final consideration influencing the approach to this project has been the
requirement to minimize the disruption to other researchers working on MDBS. Three
other projects have been conducting research concurrently with the porting project. An
approach is needed which could utilize the existing communication network during the
preparation of the new system without stopping the work of the other teams on the old
system.

These three considerations have influenced the porting plan which is described in the 
next section. 









D. THE INITIAL PLAN 

The initial plan was firmed up on February 23, 1993. It calls for a four-month
preparatory period, followed by three months for the modification and one month for the
testing and evaluation. The preparatory period involves the comparative study of both
operating systems' implementation of data types, system calls, and communications.
Differences between the old and new compilers, especially with regard to function libraries,
are examined. This time also allows the selection of a specific version of the MDBS
software to be ported (i.e., the “greg” version has been selected as the most advanced
multibackend version). It is necessary to identify implementation-dependent information,
such as host names or data directory names, which is hard-coded into the software. The
preparatory period is also used to prepare “clean” paper models of a backend and a
controller free of the extraneous files left over from prior versions of the software. The
preparatory time also allows an analysis of the new hardware (especially the new fixed
disk drives) to assess any impact on the database system software. It is hoped that almost
every required change could be identified at this early stage.
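
The search for hard-coded, implementation-dependent strings lends itself to a simple scan of the source text. The following is a minimal sketch under stated assumptions: it treats quoted names of the form db<N> as host names and quoted absolute paths as directory names. Both patterns, the function name, and the sample C lines (including the path shown) are illustrative and were not taken from the MDBS source.

```python
import re

# Assumed patterns: host names of the form db<N> and absolute paths inside
# string literals. Both are guesses to be adjusted for the site being ported.
HOST_RE = re.compile(r'"(db\d+)"')
PATH_RE = re.compile(r'"(/[^"\s]+)"')


def find_hard_coded(source_text):
    """Return (line_number, kind, value) for each suspicious string literal."""
    findings = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        for match in HOST_RE.finditer(line):
            findings.append((lineno, "host", match.group(1)))
        for match in PATH_RE.finditer(line):
            findings.append((lineno, "path", match.group(1)))
    return findings


# Invented sample input; these lines do not come from MDBS.
SAMPLE = '''\
char *backend = "db3";
char *data_dir = "/usr/mdbs/data";
int n = 2;
'''

if __name__ == "__main__":
    for lineno, kind, value in find_hard_coded(SAMPLE):
        print(lineno, kind, value)
```

A scan like this only produces candidates. Each finding still has to be read in context to decide whether it should become a configuration value on the new platform.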

The porting is to be accomplished in three phases to minimize the disruption to other
researchers. Phase one, the preparatory work, is supported by the original MDBS
configuration. Note that backends db5, db6, and db9 are crossed out or omitted in Figures
3 and 4, since they were not functional. The intermediate configuration of phase two would
be achieved by simply adding the three new workstations (one new controller, two new
backends) to existing connections on the network. MDBS will support any number of
backends on the network. It also allows multiple controllers to be physically attached to the
network as long as only one of them is operating any of the six controller processes at a
time (this is a function of the inter-machine communications design and port assignments).
Users of the old system may continue to operate normally. The movement of files using the
network remote copy command (rcp) and porting-related runs of the new system may be
executed whenever the old system is not in use. This arrangement maximizes productive
use of both systems. Once the porting work is completed and the new system is ready for
use, the old workstations may be removed from the network. Note that the position of the
new controller is changed in Figure 5. This is because the primary controller doubles as a
communications gateway to another, unrelated network at this site.

Figure 4. The Intermediate MDBS Configuration
(diagram labels: Old Controller; db3, db4, db5, db7, ISI-V; db11, db12, db13, Sun 4;
New Controller; db net)

Figure 5. The Final MDBS Configuration
(diagram labels: Controller; db11, db12, db13, Sun 4; db net)

The initial plan also focuses the initial work on the backend workstations, rather than
the controller. This is because the backends are much simpler and because the
schedule called for the delivery of the backend hardware before the controller hardware.

E. MODIFICATIONS TO THE PLAN 

The initial plan supports much of the groundwork required before the actual porting
can be started. The effort put into this analysis was time well spent. As with all first efforts,
though, refinement was necessary once the actual work commenced. A significant
change involved the decision to develop the backends first. Backend-first development is
possible, but not practical, because of the way MDBS compiles and distributes code.
MDBS is designed so that all of the source code for both the controller and the backends
is stored on the controller under the mdbs/VERSION/CNTRL and mdbs/VERSION/BE
directories. A system of forty interrelated makefiles, also on the controller (see the file
listing in Appendix C), handles the significant task of compiling the twelve processes
required to run MDBS. The six processes pertaining to each backend may then be
manually copied to the mdbs/be.VERSION directory of each backend workstation or
distributed automatically using the user interface documented in [Meeks, 1993, p. 26 - 27].
The small number of shell scripts and related files required for a backend can be copied
from the old backend or from tape. See Appendix D for a listing of all files associated with
a backend. Given the existing arrangement, it is more practical to begin the porting work
on the controller and use its built-in capability to produce the backend executable files.

Another change to the initial plan was the length of time allocated to troubleshooting. 
The initial estimates were overoptimistic by a factor of two. This situation arose primarily 
because of the length of time required to address problems resulting from undocumented 
differences between the compilers, differing implementations of the shell programming 
languages, and the degree to which implementation-dependent programming techniques
were used in MDBS. These difficulties are documented in the chapters which follow.
Additional facts and recommendations thought to be useful to the system porter are
contained in Appendix B.






III. SOFTWARE ISSUES


A. AN INTRODUCTION

The software aspects of portability are defined in terms of the programming constructs
employed and the capabilities of the target operating system and its associated compiler and
library. Early incarnations of the Multibackend Database Supercomputer software existed
on machines running the VMS and RSX operating systems. More modern versions have
been built on 4.2BSD UNIX and, most recently, 4.3BSD UNIX. It should be possible to
port MDBS to many operating systems that support process control, reliable inter-process
communication, and broadcast communication, and that provide a suitable compiler. For
the purposes of this porting project, MDBS will be moved to hardware running the SunOS
UNIX operating system version 4.1.1. These operating systems are very similar in that
SunOS is a derivative of BSD UNIX [Que, 1990, p. 18].

Developed prior to the creation of the ANSI standard for the C programming
language, MDBS software is written in Kernighan and Ritchie standard C. The compiler
used to create the executable files is the standard compiler (cc) included with the operating
system. The use of implementation-dependent programming constructs greatly increases
the difficulty of any porting project. Fortunately, portability has long been a design
consideration for MDBS [Wong, 1986, p. 9], and examples of non-portable code are
infrequent.

In this chapter, we will discuss important issues and relevant differences between the 
original and target operating systems and between the old and new compilers. Issues related 
to communications will be discussed in Chapter V.

B. OPERATING SYSTEM CONSIDERATIONS 

1. System Call Interface to the Kernel

In the UNIX environment an application, such as MDBS, interacts with the
hardware through a set of approximately one hundred system calls [Rosen and Rosinski and
Farber, 1990, p. 10]. These system calls instruct the kernel to perform various tasks, such
as file I/O or process execution. The use of system calls by MDBS was the first operating
system issue investigated as a part of this porting project.

The fewer and the more basic the system calls used by a UNIX application, the 
greater its portability [Rochkind, 1985, p. 16]. Only eighteen system calls are used to 
construct MDBS’s higher level functions. These are listed alphabetically below, along with 
their location in the code. 


Table 1: SYSTEM CALLS MADE BY MDBS

System Call     Purpose [ISIV, 1986]                    Location
-------------   -------------------------------------   ----------------------------------
accept          accept a connection on a socket         pcl.c, sndrcv.c
bind            bind a name to a socket                 ack.c, pcl.c, sndrcv.c
close           delete a descriptor (file or socket)    many places
connect         initiate a socket connection            pcl.c, sndrcv.c
exit            terminate a process                     many places
gethostname     get the name of current host            bget.c, bput.c, cget.c, cput.c,
                                                        dbl.c
getnetbyname    get access to the network               pcl.c
getpid          get a process identification number     generals.c
gettimeofday    get the date and time                   generals.c
kill            send signal to a process                shell scripts
listen          listen for connection on a socket       pcl.c, sndrcv.c
lseek           move the read/write pointer             cpcount.c, dio.c, dicp.c,
                                                        rectag.c, zero.c
open            open a file for reading or writing      many places
read            read input (files or sockets)           cpcount.c, dio.c, disp.c, iig.c,
                                                        meta.c, pcl.c, rectag.c, sndrcv.c
send            send a message from a socket            ack.c, cb.c, sndrcv.c, others
socket          create an endpoint for communication    ack.c, pcl.c, sndrcv.c
unlink          remove directory entry (file or         sndrcv.c, gsmodset.c
                socket)
write           write output (file or socket)           bes.c, cpcount.c, dio.c, iigdbl.c,
                                                        meta.c, pcl.c, rectag.c, sndrcv.c



We experienced only one problem as a result of differences between the
implementation of system calls on the old and new operating systems. The single
difference was the inability of the lseek() call to return the size of a raw device opened as
a character special file. This technique was employed to return the disk size in the original
version of zero.c. The information is passed to the user as an advisory. It is not critical to
the zeroing function and was commented out of the new version.
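
An alternative to commenting out the advisory is to treat the size as optional. The following is a minimal sketch under stated assumptions, not the zero.c code: the helper name device_size_advisory() is hypothetical, and the choice to treat a zero result as "size unknown" is ours.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of MDBS: report the size of an open file
 * or device for advisory purposes only.  Returns the size in bytes, or -1
 * when seeking to the end does not yield a usable size (as may happen on
 * a raw device whose driver does not report an end position). */
off_t device_size_advisory(int fd)
{
    off_t here, end;

    here = lseek(fd, (off_t)0, SEEK_CUR);   /* remember current position */
    if (here == (off_t)-1)
        return (off_t)-1;

    end = lseek(fd, (off_t)0, SEEK_END);    /* try to find the end */

    /* Restore the original position whether or not the seek helped. */
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return (off_t)-1;

    if (end == (off_t)-1 || end == 0)
        return (off_t)-1;                   /* size unknown: skip the advisory */
    return end;
}
```

A caller would print the advisory only when the return value is positive, so the zeroing itself never depends on the size being available.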

2. Differences in the Shell 

In UNIX, the shell is a command interpreter program (and programming
language) that serves as an interface between the user and the operating system. The shell
receives commands and arranges to have them executed. The shell scripts, or interpreter
files (start.cntrl, run.be, stop.db*, zero.db*, etc.), supporting MDBS are designed to run on
the C shell, a very common replacement for the original (Bourne) UNIX shell. MDBS
needs a shell that supports job control since it must be able to specify that certain processes
are run in the background. The C shell universally provides this support [Stevens, 1992, p.
248 - 249]. SunOS provides the C shell, but some differences in its implementation
were noted. The scripts governing the start-up of controller processes (start.cntrl), the
start-up of backend processes (run.be), and the halting of running MDBS processes (stop.cmd)
required modification to avoid syntax errors on the new system. The specific modifications
are detailed in Appendix B.

3. Differences in Stack Implementation 

Good programming practices are followed with the goal of writing
implementation-independent code. Where good practices are followed, minor differences
in the underlying implementation are usually unimportant. Sometimes, however, the form
of an offending statement is subtle. One such example was discovered during this porting
project.

The add_path() function is a simple procedure located in the utilities.c file. It is
called from twenty-seven places in the controller code. Its purpose is to prepend the path of
the data directory (DATA_AREA) to any file name passed to it so that a pointer
to the entire construct can be passed to an open() system call. The form of this function used
on the ISIV/4.3BSD platform had worked as intended for years. The function is reproduced
below.

char *add_path(filename)
{
    char path[MaxPathLength + MFNLength + 1];

    strcpy(path, DATA_AREA);
    strcat(path, filename);
    return (path);
} /* end add_path */

This same function compiled but caused numerous run-time errors on the new
Sun/SunOS system. The problem arises because the storage for "char path" is created
within the add_path() function rather than in the calling routine. That storage is automatic:
it lives on the stack and ceases to be valid once add_path() returns, so the pointer returned
to the caller refers to memory that may be overwritten at any time. The old platform
happened to leave the contents intact long enough for the caller to use them; the stack
implementation used by Sun does not. The problem was corrected by modifying add_path()
to accept both the filename and the storage space for the path from the calling routine.

This is included here as an example of the subtlety of programming problems 
which arise because of operating system implementation differences. 
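
The correction described above can be sketched as follows. This is an illustration written in ANSI C, not the revised MDBS source: the value given to DATA_AREA, the path_size parameter, and the NULL return on overflow are all assumptions made for the example. The two length constants use the limits of 40 quoted for commdata.def.

```c
#include <stddef.h>
#include <string.h>

#define DATA_AREA     "/usr/mdbs/data/"  /* hypothetical value */
#define MaxPathLength 40                 /* limit quoted for commdata.def */
#define MFNLength     40                 /* limit quoted for commdata.def */

/* Build DATA_AREA + filename in storage owned by the caller.
 * Returns path on success, or NULL if the result would not fit. */
char *add_path(const char *filename, char *path, size_t path_size)
{
    size_t need = strlen(DATA_AREA) + strlen(filename) + 1;

    if (path == NULL || need > path_size)
        return NULL;

    strcpy(path, DATA_AREA);
    strcat(path, filename);
    return path;   /* valid for as long as the caller's buffer is */
}
```

A caller would declare `char path[MaxPathLength + MFNLength + 1];` in its own scope and pass it in, so the buffer outlives the call and the returned pointer remains usable by open().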









4. Path Name and File Name Considerations

Persons involved in porting MDBS to other platforms should be aware of the
following limitations coded into the current implementation. The commdata.def file, located
in the COMMON directory, limits the maximum length of file names (40) and path names (40).
The configure.h header file, located in the “version (i.e., greg)/bin” directory, allows
backend and controller path names of length sixty-four. No changes were necessary for this
porting project, but some operating systems may impose more restrictive limits.

C. C LANGUAGE COMPILER & LINKER CONSIDERATIONS 

1. The C Language Library 

Another source of concern for this porting project is compatibility of C language 
library header files between the old and new compilers. It is the header files which 
determine the availability of library functions, the names of symbols, the format of data 
structures, and the specification of communication sockets. Only those functions and 
definitions contained in header files referenced by the MDBS code are of concern, but 
significant differences in these areas could result in a greatly increased porting effort. The 
MDBS code references seventeen system-supplied header files. These are listed below
alphabetically. 


Table 2: HEADER FILES REFERENCED BY MDBS

Included header files
---------------------
ndl»n.h
netdb.h
netinet/in.h
stdio.h
strings.h
sys/file.h
sys/socket.h
sys/time.h
sys/types.h
sys/un.h
time.h


An analysis of these header files from the old and new systems revealed only
minor differences. Examples which resulted in trivial changes to the code include the
substitution of unsigned character for simple character types in stdio.h. Some of the
Sun-supplied headers, including socket.h, are actually licensed copies of the BSD code. Both
sets of headers contain some “enhancements”, but for the most part, these changes are not
important to MDBS.

2. Type Conversion 

C is not a strongly-typed language, but its type-checking capabilities have
evolved over time [Kernighan and Ritchie, 1988, p. 3]. If the two compilers do not handle
type-checking and type conversion (especially implicit type conversions) in the same
manner, then compatibility problems could result.

Two classes of compile-time errors and warnings resulted from this difference in
type handling. The first grew out of the use of the statement, FILE *open(), which was
repeated at five locations in the original code (iig.c, iigdbl.c, bes.c, rectag.c, and dio.c). The
original compiler allowed this redeclaration of the return type from the open() system call.
The newer compiler did not. The impact was type mismatches in the way file pointers were
obtained and used. The fix involved rewriting these routines to use file descriptors with
open(), read(), write(), and close() calls rather than file pointers. This was preferable to
using the fopen() library call to obtain file pointers which could be used with fread(),
fwrite(), and fclose() library calls because of the increased speed of the unbuffered system
calls.
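
The descriptor-based style adopted in the fix can be sketched as below. The helper names put_block() and get_block() are hypothetical and do not appear in MDBS; the sketch only shows open() returning an int descriptor that is then used with lseek(), write(), read(), and close().

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes at byte offset off of the named
 * file.  Returns 0 on success, -1 on any failure. */
int put_block(const char *name, off_t off, const void *buf, size_t len)
{
    ssize_t n;
    int fd = open(name, O_WRONLY | O_CREAT, 0644);  /* int, not FILE * */

    if (fd < 0)
        return -1;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    n = write(fd, buf, len);
    if (close(fd) < 0)
        return -1;
    return (n == (ssize_t)len) ? 0 : -1;
}

/* Hypothetical helper: read len bytes from byte offset off of the named
 * file.  Returns 0 on success, -1 on failure or a short read. */
int get_block(const char *name, off_t off, void *buf, size_t len)
{
    ssize_t n;
    int fd = open(name, O_RDONLY);

    if (fd < 0)
        return -1;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Because no declaration of open() is written by hand, the return type comes from the system header and cannot disagree with the library.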

A second class of compile-time problems resulted from differences in the
handling of the automatic conversion of incompatible data types. The older compiler
allowed, and the original MDBS code included, numerous examples of implicit conversion.
One common case involved the interchange of an integer and a pointer. This practice was
allowed in the original definition of C, but is no longer permitted [Kernighan and Ritchie,
1988, p. 3]. The second common case involved incompatible structure pointers. An
example from beno.c had a pointer to one type of user-defined structure (type
ctt_definition) pointing to a very different user-defined structure (type ciat_definition). The
solution to both problems was the addition of careful explicit type conversion.

One curious example of implicit type conversion which did not raise an error
at compile-time but caused problems at run-time was discovered. The following statement
was included in the source code for the backend put (BPUT) process:

printf("The host is %s\n", msg[2 * NoBElength + 2]);

Since msg[n] (where n is an integer) is a character, not a string, the %s should be a %c.
This code compiled and ran acceptably on the old system. On the new system, it compiled
without warning but caused the process to terminate at run-time.
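
The two consistent ways of writing such a statement are sketched below. The helper names are hypothetical, and snprintf() is used (rather than the original printf()) only so the result can be inspected; which of the two forms the original author intended is not stated in the source beyond the remark that %c is correct for msg[n].

```c
#include <stdio.h>

/* Hypothetical helper: msg[index] is a single char, so it is formatted
 * with %c.  Passing a char to %s makes printf treat the character's
 * value as an address, which is undefined behavior. */
int format_host_char(char *out, size_t out_size, const char *msg, size_t index)
{
    return snprintf(out, out_size, "The host is %c\n", msg[index]);
}

/* Hypothetical helper: if a string beginning at that position is wanted
 * instead, %s must be given the address of the character. */
int format_host_string(char *out, size_t out_size, const char *msg, size_t index)
{
    return snprintf(out, out_size, "The host is %s\n", &msg[index]);
}
```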

3. Error-Checking

A newer compiler should have enhanced error-checking capabilities. This held
true in the case of this porting project. The newer compiler identified three errors which had
not been detected by the older compiler. Two of these involved missing macro names.
MDBS source code includes a number of debugging statements whose inclusion is
controlled by #ifdef statements. Two of these statements lacked a macro name, meaning that
those sections of debugging code had never been included. The third error involved an
unmatched parenthesis.

4. The Linker and the Makefiles

The linker performs complicated actions on the object modules under the
direction of the user-supplied makefiles. A wide range of compatibility problems is
possible. First, the newer Sun operating system supports dynamic linking (run-time
linking), something not supported by the older BSD system. The impact of this on the
MDBS code was unknown at the outset. In practice, this feature seems to have no impact on
the functioning of the MDBS software.

Certainly, a careful review of the compiler and linker options (flags) specified in
the makefiles is in order any time an application is moved from one platform to another.
Many of these flags are nearly universal in meaning, but others are implementation
specific. The original MDBS makefiles specified the -20 option. This instructs the compiler
to optimize code for the Motorola 68020 processor. This is inappropriate for the new
system and would cause compile-time errors. The option was removed from all forty makefiles.

One unexpected error which did surface involved the linking of the record
processing (RECP) executable. Possibly because of its large size, the makefile directed the
linking of recp.exe in two steps on the older system. This produced a program which always
generated a segmentation fault and dumped core when executed on the new system. The
problem was corrected by modifying the makefile to allow one-step linking.

D. IN CONCLUSION 

In this chapter, we have presented seven classes of operating system and language/
compiler issues relevant to the porting of the Multibackend Database Supercomputer
software. In the next chapter, we will consider issues relevant to hardware differences of
the new platform.







IV. HARDWARE ISSUES 


A. AN INTRODUCTION 

Unlike some other database machines [De Witt, 1979, p. 122 - 132], the Multibackend
Database Supercomputer was designed from the start to utilize only off-the-shelf hardware
and identical standard system software. The positive result is a mercifully small number of
hardware considerations relevant to the task of porting the software to a new hardware
platform. Two significant hardware issues remain: those related to the optimal use of the
fixed disks, and those related to the storage of hardware-specific information within the
code.

B. FIXED DISK ISSUES FOR MDBS

By definition, database machines are not general-purpose computers. Database
machines are specialized for the tasks they must perform. A principal distinguishing
characteristic is the capability to handle very large amounts of stored data quickly. The
MDBS architecture emphasizes the efficient data handling capabilities of large, fast disk
drives in each backend machine.

1. Fixed Disk Requirements 

A typical MDBS configuration would involve one fixed disk drive in the
controller and three fixed disk drives in each backend machine. The controller drive is
mounted (attaching the file system to the directory structure) so that it can be used to store
and execute the MDBS programs in the normal manner. Any fixed disk with at least
twenty-three megabytes of usable capacity will suffice. One of the three backend drives
should also be mounted. This disk need only store the approximately two megabytes of
code required by a backend machine. The other two backend drives should not be mounted,
as these will be accessed as raw devices. The size of the first of these two raw drives, the
base data disk, will be determined by the amount of data to be stored divided by the number
of backends over which it can be shared. The second backend disk, the meta data disk,
should be about one-fourth the size of the base data disk. In the Naval Postgraduate
School’s Laboratory for Database Systems Research the base data drives have ranged from
one hundred to four hundred megabytes each. Regardless of the
capacity, it is important that all of the attached backends have identical base data drives and
identical meta data drives. The following paragraphs will make the reason for this
requirement clear.

2. Understanding MDBS Disk Utilization 

Understanding how MDBS utilizes its fixed disks is fundamental to the rest of
this presentation. As stated above, the controller machine’s disk and each backend
machine’s program disk are mounted drives. These are used to store the source code,
executable code, shell programs, transaction files, and temporary files listed in Appendices
C and D.

The meta disk and data disk are unmounted drives. They remain raw devices so
that they can be used as character special files by the MDBS software. In essence, the entire
fixed drive is viewed as a special kind of file by the operating system. It is not uncommon
to use this approach for database applications on UNIX systems, since it allows the
database management system (DBMS) to bypass the file system and directly transfer data
between the process’s address space and the fixed disk using direct memory access
[Rochkind, 1985, p. 3 - 4]. The result of bypassing the buffered file system is greatly
improved performance.

The significant performance gain possible with character special files does not
come without a price. Raw devices cannot use most of the convenient file handling utilities
provided by the operating system. It is necessary to write the routines to handle the storage
and retrieval of data on the disk. Frequently, these routines are gathered together into a
collection of subroutines which can be called directly by the operating system kernel. This
is referred to as a device driver [Rochkind, 1985, p. 4]. The routines for accessing the meta-
and base-data disks on MDBS are not unified into a device driver. Rather, they are
contained in the source code for the directory management (DM, or DIRMAN) process,
which handles meta-data disk access, and the disk input-output (DIO) process, which is
responsible for access to the base-data disk.

3. Implementation Details of the Meta Disk

Meta-data describes the structure of the primary information stored in the
database. The meta-data disk stores this information about the base (primary) data using a
special format designed to take advantage of the speed of raw devices used as character
special files.

The beginning of the device contains two global tables, known collectively as the 
“header”. The first of these tables, the “Next Available Track Table” (NATT) is used to 
store the information about the next available (unused) disk track on the data disk which 
may be used for starting a new data cluster. The NATT starts at byte zero of the device. The 
other table, the “Offset Table” (OT), is a list of database identifiers for databases which 
have data stored on any backend machine. The OT begins at byte three of the device. The
entire header is of fixed size. The current value is 500 bytes. All of the values quoted here 
are stored in the meta.def file located in the DM directory. 

The header area is followed by space for one or more sets of tables. Each set of
six tables contains information about a single database. These six tables are the Descriptor-
to-Cluster Bit Map Table (DCBMT), the Templates list, the Attribute Table (AT), the
Descriptor-Descriptor ID Table (DDIT), the Descriptor Table (DT), and the Cluster-
Definition Table (CDT). Each table has a pre-determined length specified in the meta.def
file.

#define NATT_OFFSET 0
#define OT_OFFSET 3
#define HEADER 500

Figure 6 shows the organization of the meta disk. Identical meta-data tables are
maintained on each backend machine: each meta-data disk contains a complete set which
is identical to every other.








Item                Contents    Starting Address
------------------  ----------  -----------------------------------------
Header Block        NATT        NATT_OFFSET
                    OT          OT_OFFSET
Meta-Data Block     DCBMT       HEADER+(DB_no*DB_LENGTH)+DCBMT_OFFSET
for 1st Database    Templates   HEADER+(DB_no*DB_LENGTH)+TEMP_OFFSET
                    AT          HEADER+(DB_no*DB_LENGTH)+AT_OFFSET
                    DDIT        HEADER+(DB_no*DB_LENGTH)+DDIT_OFFSET
                    DT          HEADER+(DB_no*DB_LENGTH)+DT_OFFSET
                    CDT         HEADER+(DB_no*DB_LENGTH)+CDT_OFFSET
The meta-data blocks for other databases follow sequentially

Figure 6. Meta-Data Disk Storage Format

Once the disk is opened using the open() UNIX system call, information
pertaining to a database may be written to (write()) or read from (read()) the disk. The
location of the meta information may be simply calculated by adding the header size to the
offset needed to reach the desired table in the nth database, where n is the ordinal position
of the database's identifier in the Offset Table. The routines
for opening and using the meta-disk are contained in the meta.c source file located in the
DM directory. The directory management process, running on each backend machine,
carries out these activities at run-time.
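
The address calculation shown in Figure 6 can be written as a one-line function. The sketch below is illustrative only: HEADER is the value of 500 quoted in the text, but DB_LENGTH and the two table offsets are placeholder values chosen for the example (the real ones are defined in meta.def), and the function name meta_table_address() is hypothetical.

```c
#define HEADER        500L     /* header size quoted in the text */
#define DB_LENGTH     100000L  /* placeholder: bytes reserved per database */
#define DCBMT_OFFSET  0L       /* placeholder: first table in a block */
#define AT_OFFSET     40000L   /* placeholder: offset of the Attribute Table */

/* Byte address of a table for the database at position db_no (counting
 * from zero) in the Offset Table. */
long meta_table_address(int db_no, long table_offset)
{
    return HEADER + ((long)db_no * DB_LENGTH) + table_offset;
}
```

With these placeholder values, the DCBMT of the first database starts at byte 500 and the Attribute Table of the third database (db_no of 2) starts at byte 240500. The result would be handed to lseek() before the read() or write().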

4. Implementation Details of the Base-Data Disk 

The base-data (primary data) is the actual data of interest to the user of the
database management system. The base-data disk is also configured as a character special
file, but here the similarity ends. The base-data is stored on the base-data disk in a manner
completely different from that of the meta-data. The base-data storage scheme is designed
to spread the records of a database across the backends as evenly as possible. Base-data
information is not replicated. On each backend machine’s base-data disk, records are stored
in one or more tracks of the same disk cylinder. During a retrieve operation, all of the
attached backend machines would be performing their reads at the same time. This
arrangement achieves a primary high-performance design goal for MDBS, cylinder
parallel readout of data [Hsiao, 1991, p. 50 - 53]. Figure 7 depicts how four records (x),
residing on different surfaces of the same cylinder, may be efficiently read from a data-disk
drive.



Figure 7. Record Distribution on a Base-Data Disk Drive 


Unlike the meta-data storage scheme, which was based on the simple calculation
of byte offsets into the device, the complicated base-data storage scheme relies on an
understanding of several physical disk parameters (track size, tracks per cylinder, number
of cylinders) to arrange the records into the desired tracks of the disk for optimal reading
and writing. Again, the basic storage unit of the base-data disk is the track, not the byte.

This dependence on knowledge about physical disk details makes the code for 
the base-data disk the most implementation dependent of all of the MDBS code. 
Fortunately, this information has been gathered into three header files which are listed in 
sub-section 7 below. 

The routines for opening, reading, and writing the base-data disk are contained in
the disk input-output (DIO) directory and executed at run-time by the DIO process. Each
backend machine knows its next available cylinder and track (if any). Decisions as to which
backend machine will receive a newly inserted record are made by the insert-information
generation (IIG) process on the controller after coordination with the backend DM
processes [Boyne and Demurjian and Hsiao and Kerr and Orooji, 1983, p. 29].

5. Disk Initialization 

Before their first use, both the meta-disk and data-disk must be initialized. The
initialization process is repeated whenever a database is removed from the system. This
process involves opening the devices and writing zeros to all or a portion of the device.
For MDBS, this process is referred to as “zeroing” the disk. (Note that only meta- and data-
disks should be zeroed. Never zero one of the program disks, as this destroys all of the code
stored on the disk!)

Initialization is handled by an MDBS utility named “zero”. The executable code
for zero should be copied to the “mdbs/bin” directory on each backend. A master copy of
the zero executable is maintained in the “version (e.g., greg)/bin” directory on the
controller. The source code for zero (zero.c) is located in the “DIO” directory on the
controller. Zero is normally called and passed necessary information (device to zero,
number of bytes to zero) by the shell programs located in the “run” directory of the
controller.








6. The Problem and the Solution 

The aforementioned methods for initializing, writing, and reading data work well 
on the old hardware platform. The old fixed disk drives utilize the Enhanced Small Device 
Interface (ESDI). The disks’ formatting information is stored on the drive controller, and 
the entire surface of the disk is available for use by MDBS. 

The new hardware platform is equipped with disk drives utilizing the Small
Computer Systems Interface (SCSI). SCSI is a system level interface, meaning that some
of the controller functions must be built into the drive circuitry [Rosch, 1989, p. 550]. When
one of the new SCSI drives is operated under the SunOS 4.1.1 operating system, the
formatting information for that drive must be written to the disk. Sun refers to this
information as the “label”. It is located in the first block (cylinder 0, head 0, track 0) of the
disk. This represents an implementation dependency with respect to the existing MDBS
code, since the existing routines for initializing and writing to both the meta-data and base-
data disks destroy the formatting information stored in the label block on the new drives.

With regard to the disk initialization routine contained in zero.c, the solution is
straightforward. The old version of zero starts at the first byte (location 0) of the drive and
writes zeros. We have added a new constant, called “SAFETY_OFFSET”, which is
currently defined in meta.def to be 512. The lseek() system call which zero.c uses to
position the file pointer now moves to location SAFETY_OFFSET, so the writing of zeros
begins there rather than at location 0. The label block of the drive is passed over and preserved.
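
The idea can be sketched as follows. This is an illustration, not the zero.c source: the function name zero_after_label(), its arguments, and the 512-byte write size are assumptions; only the SAFETY_OFFSET value of 512 comes from the text.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SAFETY_OFFSET 512L   /* value quoted for meta.def */

/* Hypothetical sketch: write nbytes of zeros to the open descriptor fd,
 * starting just past the label block.  Returns 0 on success, -1 on
 * failure.  On a real raw device, transfers generally need to be whole
 * sectors, so nbytes would be a multiple of the sector size. */
int zero_after_label(int fd, long nbytes)
{
    char block[512];

    memset(block, 0, sizeof block);

    /* Skip the label instead of starting at location 0. */
    if (lseek(fd, (off_t)SAFETY_OFFSET, SEEK_SET) == (off_t)-1)
        return -1;

    while (nbytes > 0) {
        size_t chunk = (nbytes < (long)sizeof block) ? (size_t)nbytes
                                                     : sizeof block;
        ssize_t n = write(fd, block, chunk);

        if (n <= 0)
            return -1;
        nbytes -= (long)n;
    }
    return 0;
}
```

Everything in the first SAFETY_OFFSET bytes is left as it was, which is the property the label depends on.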

The modification to the meta-data disk writing routines is also straightforward.
The old routines write the header information beginning at location 0 on the meta-data disk.
To avoid destroying the label information, the same SAFETY_OFFSET is inserted on the
meta-data disk before the header block. This is accomplished by adding a new definition
(SAFETY_OFFSET) to the meta.def header file and “sliding” the other blocks further into
the device. The advantage of this approach is that it is transparent to the rest of the existing
code. The routines for the calculation of table locations reference the definition of
HEADER, so only the following changes are necessary:







#define SAFETY_OFFSET 512
#define NATT_OFFSET 0 + SAFETY_OFFSET
#define OT_OFFSET 3 + SAFETY_OFFSET
#define HEADER 500 + SAFETY_OFFSET
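
One general C caution applies to definitions written as unparenthesized sums. It is offered here as an observation about the language, not as a problem reported for MDBS: the address formulas in this chapter only add HEADER to other terms, and for that usage the two forms below agree. The macro and function names in the sketch are hypothetical.

```c
#define SAFETY_OFFSET 512

/* Form used in the text: the body is an unparenthesized sum. */
#define HEADER_BARE   500 + SAFETY_OFFSET

/* Defensive form: same value, safe inside any larger expression. */
#define HEADER_SAFE   (500 + SAFETY_OFFSET)

/* The two forms agree when the macro is only added to other terms ... */
long table_address_bare(long db_no, long db_length, long table_offset)
{
    return HEADER_BARE + (db_no * db_length) + table_offset;
}

long table_address_safe(long db_no, long db_length, long table_offset)
{
    return HEADER_SAFE + (db_no * db_length) + table_offset;
}

/* ... but differ once an operator of higher precedence is applied. */
long twice_header_bare(void) { return 2 * HEADER_BARE; } /* 2*500 + 512   */
long twice_header_safe(void) { return 2 * HEADER_SAFE; } /* 2*(500 + 512) */
```

A porter who later multiplies or subtracts such a constant would get 1512 instead of 2024 from the bare form, so wrapping the macro bodies in parentheses is a low-cost safeguard.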

Figure 8 shows the new organization of the meta-data disk.

Item                Contents    Starting Address
------------------  ----------  -----------------------------------------
Safety Offset       Label       0
Header              NATT        NATT_OFFSET + SAFETY_OFFSET
                    OT          OT_OFFSET + SAFETY_OFFSET
Meta-Data Block     DCBMT       HEADER+(DB_no*DB_LENGTH)+DCBMT_OFFSET
for 1st Database    Templates   HEADER+(DB_no*DB_LENGTH)+TEMP_OFFSET
                    AT          HEADER+(DB_no*DB_LENGTH)+AT_OFFSET
                    DDIT        HEADER+(DB_no*DB_LENGTH)+DDIT_OFFSET
                    DT          HEADER+(DB_no*DB_LENGTH)+DT_OFFSET
                    CDT         HEADER+(DB_no*DB_LENGTH)+CDT_OFFSET
The meta-data blocks for other databases follow sequentially

Figure 8. Modified Meta-Data Disk Storage Format

The changes required to protect the label area on the base-data disk require a
different approach. It does not make sense to try to follow a byte-based approach to
protecting the label block, since the existing storage routines are based on cylinders and
tracks. Likewise, an approach which tests every insert operation for indications that a write
to the label block is about to occur is not satisfactory since this would impose the overhead
of two extra comparisons per write.

The Next-Available-Track-Table (NATT) on the meta-data disk stores the next
available cylinder and next available track information for the backend on which it resides.
Whenever a backend does not already contain records from an existing database, these
values are both zero. This is because the initialization routine (zero) writes zeros to every
location on the disk. Two global external variables, av_cylinder (type unsigned short) and
av_track (type unsigned char), are declared in dirman.def for the purpose of receiving and
holding these values during execution. The information is read into the variables during an
initialization call to the read_meta_NATT() function in the DM process.

Since our goal is to preserve the label area of the disk, the obvious way to proceed 
is to ensure that the available cylinder and track variables never indicate that cylinder 0 
track 0 (the location of the label) is available for writing. Rather than modify the routine 
which zeroes the base-data disk, we will add a new function, init_meta_NATT(), to meta.c 
which will write a zero to the available cylinder and a one to the available track portion of 
the Next-Available-Track-Table immediately after the meta-data disk is opened. This call 
will execute before the read_meta_NATT() call so that the global variables receive values 
of zero and one, rather than zero and zero. 

The first write to the base-data disk now skips cylinder 0/track 0 (where the label 
area is located) and writes instead to cylinder 0/track 1. This approach costs one track per 
backend base-data disk, but avoids the complexity of a byte-based scheme and the run-time 
overhead of testing before writing. It also has the advantage of being flexible, in that the 
base-data disk's beginning cylinder/track address can be changed to suit the user's need 
simply by changing the values of the new constants “first_record_cylinder” and 
“first_record_track” contained in dirman.def. 
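
The effect of this change can be sketched as follows. This is a minimal illustration, not the code added to meta.c: the NATT layout (a two-byte cylinder followed by a one-byte track), the NATT_OFFSET constant, and the use of stdio on an ordinary file in place of the raw meta-data device are all assumptions made for the example. Only the variable types and the values zero and one come from the text.

```c
#include <stdio.h>

/* First usable location on the base-data disk (values from the text). */
#define first_record_cylinder 0
#define first_record_track    1

/* Hypothetical byte offset of the NATT within the meta-data disk. */
#define NATT_OFFSET 0L

unsigned short av_cylinder;  /* next available cylinder */
unsigned char  av_track;     /* next available track    */

/* Seed the NATT so that cylinder 0/track 0 (the label) is never handed out.
 * Returns 0 on success, -1 on an I/O error. */
int init_meta_NATT(FILE *meta)
{
    unsigned short cyl = first_record_cylinder;
    unsigned char  trk = first_record_track;

    if (fseek(meta, NATT_OFFSET, SEEK_SET) != 0) return -1;
    if (fwrite(&cyl, sizeof cyl, 1, meta) != 1) return -1;
    if (fwrite(&trk, sizeof trk, 1, meta) != 1) return -1;
    return fflush(meta) == 0 ? 0 : -1;
}

/* Load the NATT into the two global variables.
 * Returns 0 on success, -1 on an I/O error. */
int read_meta_NATT(FILE *meta)
{
    if (fseek(meta, NATT_OFFSET, SEEK_SET) != 0) return -1;
    if (fread(&av_cylinder, sizeof av_cylinder, 1, meta) != 1) return -1;
    if (fread(&av_track, sizeof av_track, 1, meta) != 1) return -1;
    return 0;
}
```

Calling init_meta_NATT() and then read_meta_NATT() on a freshly zeroed file leaves av_cylinder at 0 and av_track at 1, so the first write lands on cylinder 0/track 1.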


7. Hard-Coded Disk Information 

As discussed above, certain implementation-dependent fixed-disk information is 
hard-coded into MDBS. All of the fixed-disk-related definitions are contained in four 
header files. The following definitions are from the commdata.def file in the COMMON 
directory: 

#define RecDiskSize 95421000 

#define no_tracks 6 

#define TrackSize 8192-2 

RecDiskSize refers to the size of the new base-data (record) disk. The value of 
no_tracks is the number of tracks per cylinder (the number of heads) on the new disk. 
TrackSize is the segment size for the storage of records on each track of the base-data disk. 
It is included here because this was originally the same as the formatted track size of the 
disk, but this changed as a result of performance testing several years ago. Changes to this 
value should be carefully considered, as there are numerous side effects. 
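
One possible side effect is purely lexical. As quoted, the definition of TrackSize is not parenthesized, so the C preprocessor pastes "8192-2" into any larger expression unchanged. The sketch below shows the difference this makes. The function names and the parenthesized variant TrackSizeSafe are hypothetical, and whether the MDBS sources actually use TrackSize inside a larger expression is not established here.

```c
/* Definitions as quoted from commdata.def. */
#define no_tracks 6
#define TrackSize 8192-2

/* Hypothetical parenthesized variant, shown only for comparison. */
#define TrackSizeSafe (8192 - 2)

/* Expands to 6 * 8192 - 2: the subtraction is applied after the multiply. */
long cylinder_bytes_unparenthesized(void)
{
    return (long)no_tracks * TrackSize;
}

/* Expands to 6 * (8192 - 2): the intended six segments of 8190 bytes. */
long cylinder_bytes_parenthesized(void)
{
    return (long)no_tracks * TrackSizeSafe;
}
```

The two functions return 49150 and 49140 respectively. Anyone changing TrackSize on a new platform may want to check each use of the constant for this kind of precedence dependence.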

The dio.h file in the DIO directory contains the device name for the base-data 
(record) disk in the following statement: 

char *driver_names = { "/dev/sd4c" }; 

This same value is written into the code for the disk utility rectag.c and into each of the 
zero.db* shell scripts located in the “run” directory. 

The following definitions are contained in the meta.def header file (these are in 
addition to the structure information stated earlier): 

#define META_DISK_NAME "/dev/sd2c" 

#define META_DISK_STORAGE 95421000 

META_DISK_NAME is the device name of the meta-data disk. The value specified 
for META_DISK_STORAGE is the formatted storage capacity of the meta-data disk. 

The final header file containing fixed-disk-specific information is dirman.def in 
the DM directory. It contains the following definition: 

#define no_cylinders 974 

This is the number of cylinders on the base-data (record) disk. 


C. OTHER HARDWARE SPECIFIC ISSUES FOR MDBS 


While the physical details of the fixed disk drives are the most obvious hardware-specific 
information coded into MDBS, there are other details important to a porting 
project. Details of the workstations and network hardware comprising MDBS must be 
considered when moving to a new hardware platform. 

1. Workstation Information 

One major area of concern involves the storage and use of information (e.g., host 
names) relative to the MDBS workstations. The maximum number of workstations 
attached to the MDBS communication network, their specific names, and the format 
(alphabetic characters and numbers) of their names are all hard-coded into MDBS. The 
maximum number of backend machines is explicitly coded in two places (once as an 
integer, once as a character) in pcl.def (located in the COMMON directory). The character 
version of the number is not used by the “greg” version of the software. The maximum 
number of backend workstations is implicitly defined in another place in the code. This 
statement from ti.c (located in the TI directory) sets a condition for a successful start-up of 
the system: 

if (NOBE[0] > '0' && NOBE[0] < '3') { 

The upper bound ('3') should be one greater than the maximum number of backend machines. The 
dbl_template() function in the dbl.c source file contains several statements which are highly 
dependent on the number (and name) of the workstations comprising MDBS. The values 
associated with the variable buMP and its length (buMPno) are critical. This function 
should be carefully analyzed by anyone porting MDBS to a new hardware platform, as it is 
very implementation dependent. 
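
One way to keep the start-up test in ti.c tied to the configured maximum is to derive the upper bound from a constant instead of writing the character literal. The following is a sketch under assumptions: the constant MAX_BACKENDS and the function nobe_is_valid() are hypothetical names, and the value 2 is inferred from the quoted condition, not taken from pcl.def.

```c
/* Hypothetical constant; 2 is inferred from the '3' in the quoted test. */
#define MAX_BACKENDS 2

/* Nonzero when the first character of nobe is a digit in 1..MAX_BACKENDS.
 * Like the original test, this handles single-digit counts only, so
 * MAX_BACKENDS must not exceed 9. */
int nobe_is_valid(const char *nobe)
{
    return nobe[0] > '0' && nobe[0] < '0' + MAX_BACKENDS + 1;
}
```

With MAX_BACKENDS set to 2 this accepts "1" and "2" and rejects "0" and "3", matching the quoted condition, while a port to a larger configuration would change one definition only.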

A complete set of the host (workstation) names is written into the configure.h 
file. Configure.h is the header file for the executable (main) which automates the start-up 
and shut-down of the MDBS software. The host name of the controller machine is specified 
separately in configure.h and is also contained in pcl.def. 










Some routines in the MDBS code are sensitive to the length of the host names of 
the attached workstations. These are routines which pick the names of sending stations 
from messages by relying on the length of the station name. The very 
implementation-dependent coding of the dbl_template() function in dbl.c has already been mentioned. 
Another example is found in ack.c (from the COMMON directory), which includes this 
statement at about line 626: 

for (i = 0; i < 4; i++) 

loudmouth[i] = hp -> h_name[i]; 

The number 4 represents the number of characters (alphabetic and numeric) in the host 
names (e.g., db13 is four characters long). The number 4 has subsequently been replaced 
by the definition host_name_length. A similar example is written into pcl.c (also located in 
the COMMON directory). Here the implementation-dependent statement in the 
get_first_message() function is: 

for (i = 0; i < 4; i++) 

brdcatn Q host [i] = hp -> h_name[i]; 

Here, also, the number 4 results from the number of characters in the host names. Because 
of statements like these, all of the host names of the machines making up MDBS should 
have the same length. The current implementation is set up for host names of length four, 
but this could be changed. The constant MAX_HOST_LEN, contained in msg.def (located 
in the COMMON directory), could be used to describe the length of host names. 
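
The fixed-length copy loops quoted above can be expressed once, against a named length, so that a port to machines with longer or shorter names touches a single definition. The sketch below is an illustration and not MDBS code: the function copy_host_name() is hypothetical, and the value 4 is simply the current name length described in the text.

```c
#include <string.h>

#define host_name_len 4   /* length of names such as "db13" */

/* Copy exactly host_name_len characters of src into dst and terminate it.
 * dst must have room for host_name_len + 1 bytes. Returns 0 on success,
 * -1 if src is shorter than host_name_len. */
int copy_host_name(char *dst, const char *src)
{
    if (strlen(src) < host_name_len)
        return -1;
    memcpy(dst, src, host_name_len);
    dst[host_name_len] = '\0';
    return 0;
}
```

Unlike the quoted loops, this version refuses a source string that is too short and always terminates the destination, which avoids reading past the end of a short name.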

Some sections of MDBS code are sensitive to the format of the host names (e.g., 
“db11”). Ack.c, located in the COMMON directory, is a principal source file for routines 
which acknowledge messages broadcast by the controller or a backend machine to all other 
machines. Some details about the implementation of reliable broadcasting for MDBS are 
located in [Wong, 1986, p. 38 - 44, 61]. The routines of this file are very highly dependent 
on the composition of host names. As originally implemented, these routines expect to deal 
with host names of length three, where the last character is a unique number which can be 
converted into an integer. The original code in ack.c also assumed that the highest number 
portion of the host name would never be larger than the maximum number of workstations 
attached to MDBS. We have made those alterations necessary to make the code easier to 
port to different hardware with host names of other lengths. New constants have been added 
(e.g., lowest and highest numbered workstations) to logically separate the range of the 
number portion of the host names from the range of index values for the host_names array 
(the host_names array is a two-dimensional array which holds the host names of all 
workstations participating in a given run of MDBS). This separation improves portability 
and allows the efficient use of higher-numbered workstations. An illustration of this is 
provided in the assemble() function located in the ack.c source file. These additions also 
improve readability. 

The new definitions include host_name_len (the length of the host name string), 
min_ws_number (the number portion of the least workstation host name), and max_ws_number 
(the number portion of the greatest workstation host name). The values associated with 
these definitions, located in ack.def and ack.dcl, would have to be adjusted for a different 
set of workstations with different host names. The current implementation is set up for a 
host name of length four, of which the last two characters are numbers. No two host names 
should end in the same numbers, as MDBS uses these to uniquely identify each workstation. 

A new function, host_name_integer(), which receives the host name string and 
returns the number portion as an integer, has been added to ack.c as a replacement for the 
old method. The source code for the new, more portable function is contained in Appendix 
F. 
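
To make the idea concrete without reproducing Appendix F, the following sketch shows one way such a function could behave. It is not the thesis code: the body is an independent illustration, the helper host_index() is hypothetical, and the values 11 and 13 given to min_ws_number and max_ws_number are assumptions chosen to match example names such as db11 and db13.

```c
#include <ctype.h>
#include <string.h>

#define min_ws_number 11  /* assumed value, for illustration */
#define max_ws_number 13  /* assumed value, for illustration */

/* Return the number formed by the trailing decimal digits of name,
 * or -1 if the name does not end in a digit. */
int host_name_integer(const char *name)
{
    size_t end = strlen(name);
    size_t start = end;
    size_t i;
    int value = 0;

    while (start > 0 && isdigit((unsigned char)name[start - 1]))
        start--;
    if (start == end)
        return -1;
    for (i = start; i < end; i++)
        value = value * 10 + (name[i] - '0');
    return value;
}

/* Map a host name to a zero-based index into a host_names-style array,
 * keeping the numbering of the machines separate from the array bounds.
 * Returns -1 for names outside min_ws_number..max_ws_number. */
int host_index(const char *name)
{
    int n = host_name_integer(name);

    if (n < min_ws_number || n > max_ws_number)
        return -1;
    return n - min_ws_number;
}
```

Because the digits are located by scanning from the end of the string, this version does not depend on the length of the alphabetic prefix, which is the portability property the text describes.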

Almost all of the shell scripts associated with starting and stopping MDBS 
contain host name or device dependent information. These simple files, located in the “run” 
directory on the controller, include all of the stop.be*, zero.db*, and stop.db* files. All will 
require slight modification when MDBS is moved to a new hardware platform. 

2. Network Communication Information 

The third major source of hard-coded, implementation-specific details in the 
MDBS code is the communication network. The name by which the network may be 
accessed and important hardware port assignments are specifically written into the code. 
More will be said about this in the next chapter. 

D. IN CONCLUSION 

In this chapter we have brought forth the three primary sources for hardware-specific 
statements in the MDBS code. A specific modification to the way in which the raw devices 
are written has also been described. In the next chapter, we will present details of the 
communications issues important to porting MDBS to a new platform. 


V. COMMUNICATIONS ISSUES 


A. AN INTRODUCTION 

The third major issue which defines the size and scope of the porting task for anyone 
moving the Multibackend Database Supercomputer to a new hardware and/or software 
platform is communications. The loosely coupled parallel architecture of MDBS is very 
highly dependent upon communications. This communication requirement exists between 
the processes running on any one workstation and between processes running on different 
workstations. This chapter starts with a presentation of the communication requirements of 
MDBS. Next, the design of MDBS communications is discussed. Finally, details relevant 
to specific implementations are provided. 

B. MDBS COMMUNICATIONS REQUIREMENTS AND IMPLEMENTATION 

MDBS requires both inter-process and inter-machine communications support. 
Inter-process communication (IPC) is communication between the processes running on a single 
workstation. The controller depends on reliable inter-process communications to 
coordinate the actions of the six processes running concurrently on the controller 
workstation. Each backend machine depends on reliable inter-process communication to 
coordinate the actions of the six backend processes. 

1. Inter-Process Communications 

The current implementation of MDBS supports inter-process communication 
through the facilities offered by the old (BSD4.3) and new (SunOS 4.1.1) UNIX operating 
systems. MDBS uses sockets of type SOCK_STREAM in the UNIX domain (AF_UNIX) 
for communications between processes on the same machine. Like the Transmission 
Control Protocol (TCP), which carries stream sockets in the Internet domain, this stream 
service is reliable, meaning that there is no need for MDBS to check for the delivery, 
sequencing, or duplication of messages sent over it. A message which is transmitted may 
be assumed to be delivered successfully. This is of great importance to a database system 
where data integrity is a central concern. 


An inter-process communications channel is established asymmetrically using 
the client-server model [Leffler and Fabry and Joy and Lapsley and Miller and Torek, 1987, 
p. PS1: 8-2 - 8-10]. The client makes the system call, socket(), which creates an endpoint 
for communication and returns a descriptor. The client then attempts to connect to a server 
using the connect() system call. The server also creates a socket and then uses the bind() 
system call to assign a name to the socket. The server then listens for a connection attempt 
by the client process. Once the connection has been established, simple read() and write() 
system calls may be used to transfer data. These messages are not limited to a specific 
length by the operating system. They are written into buffers created by MDBS, and it is 
the size of the MDBS buffers, not the operating system, that limits the maximum size of 
the messages. 
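
The call sequence just described can be sketched with two small helpers. This is a generic illustration of UNIX-domain stream sockets and not the contents of sndrcv.c: the function names ipc_listen() and ipc_connect() are hypothetical, and the backlog of 5 is an arbitrary choice.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a UNIX-domain address. Returns -1 if path is too long. */
static int make_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof addr->sun_path)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Server side: socket(), bind(), listen(). Returns a listening descriptor
 * or -1. The caller then uses accept() to obtain the connected descriptor. */
int ipc_listen(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (make_addr(&addr, path) != 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    unlink(path);   /* discard a stale name left by an earlier run */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 5) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: socket(), connect(). Returns a connected descriptor or -1. */
int ipc_connect(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (make_addr(&addr, path) != 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After ipc_listen() and ipc_connect() succeed on the same path and the server calls accept(), ordinary write() and read() calls on the two connected descriptors move the data in either direction.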

2. Inter-Machine Communications 

Individual MDBS workstations need to be able to send messages to other 
individual machines as well as broadcast a message to all other workstations (backend and 
controller alike). The current implementation of MDBS also uses the support offered by 
UNIX for inter-machine communications. MDBS workstations are all connected to a 
standard ethernet cable. MDBS creates sockets of type SOCK_DGRAM in the Internet 
domain (AF_INET) under the User Datagram Protocol (UDP). The sockets of type 
SOCK_DGRAM support connectionless, unreliable messages of a small, fixed length 
[Rieken and Wieman, 1992, p. 39 - 45, 51]. By unreliable, we mean that messages may 
become lost - it is not possible to assume that a message will be received. Since reliability 
is critical to this application, another level of communications protocol was added to 
MDBS to support reliable broadcasting with acknowledgments. Details of this are 
contained in [Wong, 1986, p. 38 - 44, 61]. 

The familiar socket() system call is used to create the socket in the Internet 
domain. The connect() system call is used to establish the link. The socket’s address 
includes the Internet address and port number. Messages are transmitted to another socket 
of type SOCK_DGRAM using the send() system call. The 32-bit Internet address is 
automatically assigned to all messages defined for this socket. Since the messages are of 
fixed format and length, MDBS uses the read() system call rather than the recv() system 
call to read the messages. 
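
The datagram calls named above can be sketched in the same way. The example is a generic UDP illustration, not the MDBS code in pcl.c or cb.c: the function names are hypothetical, the loopback address stands in for the ethernet, and the port is chosen by the system where MDBS would use its hard-coded port numbers.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Receiver: a SOCK_DGRAM socket bound to the loopback address. Pass *port
 * as 0 to let the system choose; the port actually bound is stored back
 * into *port. Returns the descriptor or -1. */
int dgram_receiver(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(*port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Sender: on a datagram socket, connect() only records the default
 * destination so that send() can be used without an address argument.
 * No connection is set up and delivery is still not guaranteed. */
int dgram_sender(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A message handed to send() on the sender's descriptor arrives whole at the receiver, where a single read() returns it. Each datagram is independent, which is why MDBS adds its own acknowledgment layer on top.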

C. COMMUNICATION CHANNEL DESIGN AND FUNCTION 

Now that the general form of MDBS communications requirements has been 
presented, a discussion of the design and use of communications channels within MDBS is 
needed. 

1. Process Functions 

A brief review of the functions performed by the twelve MDBS processes is 
included here as an aid in understanding the communication channel design. More detailed 
information about these processes and how their functions have changed over time is 
available in [Boyne and Demurjian and Hsiao and Kerr and Orooji, 1982, p. 3, 29 - 33], 
[Wong, 1986, p. 10, 38 - 44], [Hammond, 1992, p. 4 - 5]. The six controller processes are 
controller get (CGET), controller put (CPUT), test interface (TI), request processing 
(REQP), insert-information generation (IIG), and post processing (PP). The CPUT process 
is responsible for sending DGRAM messages across the ethernet to other MDBS 
workstations. The CGET process is responsible for receiving DGRAM messages from 
other workstations. The TI process is the user interface. It contains the routines for 
activating the selected language interface and capturing the user’s instructions from the 
terminal. The REQP process parses the user’s requests and checks for proper format and 
syntax before forwarding the request. The IIG process handles the clustering of the 
database records across the backend machines. It includes a global table of locality 
information (backend number, cylinder, track). The PP process properly formats the results 
received from backend machines so that they can be displayed to the user. 

The six backend processes are backend get (BGET), backend put (BPUT), record 
processing (RECP), concurrency control (CC), directory management (DM), and disk 
input/output (DIO). All six of these processes run on each backend machine participating 
in MDBS. The BPUT process is responsible for sending DGRAM messages to other 
MDBS workstations over the ethernet cable. The BGET process receives these same 
inter-machine messages for the backend machine on which it is running. The RECP process is 
responsible for the manipulation of records. This includes selection, retrieval, and value 
extraction. The CC process is charged with maintaining meta-data and base-data (record) 
integrity during the processing of transactions. The DM process is responsible for all access 
to the meta-data disk. It coordinates with the record processing process in gathering 
information about how the base-data (records) are stored. The DIO process handles all 
reading from and writing to the base-data (record) disk. 

2. Design of the Communications Channels 

To support the inter-process and inter-machine communications requirements of 
MDBS, the communications links shown in Figure 9 are 
established using the system calls covered in part B. Figure 9 shows all of the channels built 
into MDBS, not just the primary ones. Only one backend machine is shown. It is 
representative of all backend machines. Note that the inter-process communication links 
have arrows. This is intended to show which process initiates the link, not the direction of 
information flow, since the SOCK_STREAM connections are bi-directional. The arrows 
associated with the SOCK_DGRAM sockets in the PUT and GET processes do not imply 
an actual connection, but do show the direction of message flow. 

3. How MDBS Establishes Communications 

All of the communication channels shown in Figure 9 are established during 
system generation (start-up). The MDBS code logically separates the establishment of 
inter-process and inter-machine communication. The inter-process communications are 
established within each workstation first. The IPC links within each machine are handled 
by the send-receive initialization routine (initsr()) contained in the sndrcv.c source file 
located in the COMMON directory. 



Figure 9. MDBS Communication Channels 


Once the inter-process channels are established, the inter-machine links are 
created. The initcb() (initialize controller-backend communication) function in the cb.c 
source file and other functions in the pcl.c source file (both located in the COMMON 
directory) handle the establishment of inter-machine communication. Within initcb(), the 
initbtoc() function sets up the connectionless link and the init_ack_put() function sets up 
the additional reliability layer built into MDBS which allows reliable broadcasting. Each 
backend machine needs to identify itself to the controller before the controller’s permanent 
socket is created. This is handled through the creation of a temporary, “universal” socket 
in the controller (using the init_serv() function in pcl.c). The universal socket is replaced 
by the permanent, dedicated controller socket once the controller knows how many 
backend machines are presently configured. 

The job of establishing the inter-machine communication channels starts in the 
controller. During initialization, the TI process tells the CGET process to set backend 
numbers with a SetNoBE (message code 923) message. The CGET process then receives 
an initial identification message (BeWho, message code 925) from a backend over the 
ethernet. CGET forwards this message to the CPUT process, which transmits a message 
(SetNoBE, message code 925) back to the identified backend over the ethernet. 
Initialization is complete when that backend’s BGET process sends an inter-process 
message with the backend number (SetBeNo, message code 924) to the CC, DM, RECP, 
and BPUT processes. These and other process and message related codes are contained in 
the msg.def header file. 

Shutdown is accomplished by the finish send-receive (finishsr()) function 
and the close socket (closesock()) function. Calls are made to these functions when system 
shutdown is ordered by the user or a critical communications failure occurs. Note that the 
inter-process communication sockets are unlinked immediately following creation. This 
does not close the socket as long as the associated process remains viable. This does ensure 
the socket is not left open if the process terminates abnormally [Stevens, 1992, p. 96]. 


D. DETAILS IMPORTANT TO PORTERS OF MDBS 


This section contains communications-related information of particular importance to 
anyone porting MDBS to another platform. 

1. Limitations on Message Lengths 

There are limitations to the lengths of both SOCK_STREAM and 
SOCK_DGRAM messages of which the porter needs to be aware. The maximum length of 
the inter-machine messages handled by SOCK_DGRAM sockets is set by the operating 
system. UNIX limits such messages to 1450 bytes. This limit is hard-coded into the MDBS 
code as the constant BRDCSTSZ (broadcast size) in pcl.def (located in the COMMON 
directory). This value might have to be changed on a different operating system platform. 

The maximum length of the inter-process messages handled by SOCK_STREAM 
sockets is not determined by the operating system, but rather by the amount of buffer space 
set aside by MDBS for messages. The constant, MSGLEN (contained in both msg.def and 
licommdata.def), limits inter-process messages to approximately eight kilobytes. This 
value is related to the track size of the base-data disk. This limit might have to be changed 
on a different platform. If a change to the buffer size is necessary, the values of the 
following definitions will also have to be changed: RESTMSGLEN (located in msg.def in 
the COMMON directory), PP_ResBufSize (located in pp.def in the PP directory), 
RESLength and REQLength (both located in tstint.def in the TI directory). 
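
The two limits interact whenever a message that fits in an inter-process buffer has to travel between machines, because it must then be split into datagrams. A rough count of the datagrams needed can be sketched as below. The function fragments_needed() is hypothetical, and the calculation assumes that the whole of each datagram carries payload. Any per-fragment header used by MDBS would raise the count.

```c
#define BRDCSTSZ 1450   /* datagram size limit quoted from pcl.def */

/* Number of datagrams needed to carry msg_len bytes when each datagram
 * holds at most BRDCSTSZ bytes of payload (ceiling division). */
unsigned fragments_needed(unsigned msg_len)
{
    return (msg_len + BRDCSTSZ - 1) / BRDCSTSZ;
}
```

For a message of 8190 bytes (the quoted TrackSize of 8192-2), this gives six fragments. Changing either BRDCSTSZ or the buffer size on a new platform therefore changes the fragment count as well.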

2. Access to the Network 

MDBS must be able to gain access to the network and communication ports in 
order to implement inter-process and inter-machine communication. Access for 
inter-process communication is considered first. The pcl.def file, located in the COMMON 
directory, contains the hard-coded name of the network joining the MDBS workstations. 
The constant NETNAME (currently “npsisnet”) is the name by which MDBS accesses the 
network. The getnetbyname() call, issued from pcl.c, establishes the access. 
Whatever network name is used, it must be reflected in both pcl.def and the UNIX 
operating system’s network database. This database is stored in the /etc/networks file. 

Inter-machine communications, in the Internet domain, require additional, 
implementation-dependent information. Setting up the inter-machine communications 
requires access to the system's network database file, just as the inter-process 
communications did. This is because the network’s 32-bit internet address is recorded 
there. Communications in the Internet domain also require a port number. MDBS code 
includes the hard-coded numbers for four ports. Pcl.def defines the port numbers for the 
backend and controller ports: 

#define BE_PORT 1650 

#define CNTRL_PORT 1651 

Ack.dcl defines the port numbers for the retransmission and acknowledgment ports: 

#define RETPORT 1700 

#define ACKPORT 1800 

These port numbers must be assigned by the system administrator and will be different in 
every implementation. 

3. Socket Definitions 

The operating system’s definitions for the sockets it supports are contained in the 
socket.h header file. The old and new operating systems’ versions of this file should be 
carefully compared. For this project, no changes were needed. 

4. The Socket Directory 

The location for the controller and backend sockets is specified in the MDBS 
code. The sndrcv.def header file includes the following definition: 

#define PREFIX "/u/mdbs/Sockets" 

Any change to the location for the sockets in the controller or backend directory structure 
must be reflected here. 


5. A New Aid for Timing Messages 

An additional debugging flag has been added to the existing ones in the flag.def 
files to allow easier reading of the process trace files. This flag is the send-receive timing 
flag (SRTimeFlag). When included in the flag.def files, it causes the system time (in 
seconds) to be added to the trace whenever any inter-process communication functions 
(send() and receive()) or inter-machine communication functions (put_message() and 
get_message()) are called. This timing information can be useful for understanding or 
trouble-shooting the traces belonging to the six processes running on a single workstation, 
but is of limited use in making comparisons between different workstations since 
workstation clocks are rarely so closely synchronized. 
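
The mechanism is ordinary conditional compilation. The sketch below shows the general pattern only: the function trace_call() and the format of the trace line are hypothetical and are not taken from the MDBS sources.

```c
#include <stdio.h>
#include <time.h>

#define SRTimeFlag   /* remove this definition to omit the timestamps */

/* Write one trace line for a communication call. When SRTimeFlag is
 * defined, the line is prefixed with the system time in seconds.
 * Returns the number of characters written, or a negative value on error. */
int trace_call(FILE *trace, const char *function_name)
{
#ifdef SRTimeFlag
    return fprintf(trace, "%ld %s\n", (long)time(NULL), function_name);
#else
    return fprintf(trace, "%s\n", function_name);
#endif
}
```

Because time() has a resolution of one second, several calls made in quick succession on one workstation will carry the same stamp. The stamps order events coarsely and do not measure short intervals.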

E. IN CONCLUSION 

Communications, both inter-process and inter-machine, are central to the functioning 
of the Multibackend Database Supercomputer. Any attempt to move MDBS to a new 
hardware or operating system software platform must carefully consider the 
communications support available on the new platform as well as communications 
parameters hard-coded into MDBS. 


VI. MISCELLANEOUS OTHER ISSUES 


A. AN INTRODUCTION 

The purpose of this short chapter is to bring out two issues which are important to 
porting the Multibackend Database Supercomputer software, but which did not fit neatly 
into any of the preceding chapters. These two “loose ends” are the storage of MDBS 
directory information and the method for attaching or deleting language interface modules 
from the TI process. 

B. MDBS DIRECTORY INFORMATION 

Certain information regarding the directory structure of MDBS is written into the 
code. The full pathnames of both the “home” directory and the data files directory (where 
users' database schema and transaction files are stored) are hard-coded. The commdata.def 
file in the “version(greg)/COMMON” directory includes these two definitions: 

#define DATA_AREA "/u/mdbs/UserFiles/" 

#define HOME "/u/mdbs/" 

The definition of the data files directory is repeated in the licommdata.def file found in the 
“/u/mdbs/version(greg)/CNTRL/TI/LangIF/include” directory. 

The full pathname of the directory where the controller communication sockets are 
located is also hard-coded. The sndrcv.def file in the COMMON directory includes this 
definition: 

#define PREFIX "/u/mdbs/Sockets" 

Any change to this directory structure, including a change to the name of the root 
directory (“/u”) to which this file system is mounted, would require changes to the above 
listed files. The makefiles are written using relative path names, so minor changes should 
not affect the integrity of these files. 


C. ADDITION/DELETION OF LANGUAGE INTERFACE MODULES 


MDBS has a flexible architecture in terms of both its hardware and its software. 
Different versions of the software include different mixtures of non-kernel language 
interface modules. The purpose of this section is to identify the critical linkages between 
the kernel and non-kernel interfaces contained within the test interface (TI) code. 

First, for any of the non-kernel language interfaces to be included, the language 
interface flag (LangIF_Flag) must be visible to the compiler. This means that the “#define 
LangIF_Flag” statement in the file “Flags.def” located in the TI directory must not be 
commented out. 

Second, there must be a function call to initialize the specific non-kernel language 
interface. This is accomplished by loading the schema for the non-kernel model. This call 
(e.g., creat_rel_db_list) should be located around line 90 in the ti.c file. 

Finally, it is necessary to add a menu choice and a call to the main procedure for the 
desired language interface. This code should be placed within the while loop following the 
function call to initialize the interface. 

Once these changes are made and ti.exe is recompiled (which may require the 
modification of one or more makefiles), the new language interface should be available. 
Assuming the new language interface is properly written, the thread of execution in the TI 
process may now be switched to the new language interface. Removing one or more language 
interfaces is accomplished by reversing the above steps. For more information on the 
design of a non-kernel language interface, see [Bourgeois, 1993, p. 29 - 36]. 


VII. CONCLUSIONS 


A. AN INTRODUCTION 

The Multibackend Database Supercomputer offers elegant, practical solutions to the 
four most pressing large database problems facing government and industry. MDBS allows 
data sharing, resource consolidation, scalable performance, and capacity growth. 

B. WHAT HAS BEEN ACCOMPLISHED 

In light of its promise, there is a need for a theory of MDBS system software 
portability. Such a theory identifies specific issues and problem areas for MDBS 
portability. It also increases the understanding of portability issues for parallel databases in 
general. 

In this thesis, such a theory of system software portability has been developed. 
Portability has been addressed from the standpoint of hardware issues, operating system 
software issues, and communication issues. General types of problems as well as specific 
examples of problems and solutions have been presented. It is hoped that this information 
will facilitate the porting of MDBS to other platforms and stimulate interest in and 
development of the concepts embodied in MDBS. We have attempted to incorporate the 
knowledge gained through seven months of studying and experimenting with MDBS into 
this thesis so that the learning curve for future researchers and programmers will be eased. 
We have also sought to point out limitations in the current implementation of MDBS which 
can be corrected or enhanced. 

C. TOPICS FOR FURTHER RESEARCH 

MDBS is a large and complicated system. It has developed significantly over the last 
dozen years, but there is still more which can be done. This section discusses topics for 
further research. 

One area for research involves improving the user and database capabilities of MDBS. 
Currently, MDBS can only support a single user and one database at a time. MDBS needs 
to be able to support multiple users and multiple databases simultaneously. Some work has 
already been done in this regard. The overall design of MDBS supports multiple users and 
databases (e.g. the meta-data and base-data disk organization). Many of the data structures 
related to users and databases are written so as to be easily expandable. This work should 
be continued. 

Another research area is the expansion of the cross-model access capabilities of 
MDBS. Currently, the concept has been proved through the creation of a relational-to- 
hierarchical cross-model access module. Practical cross-model accessibility is possibly the 
most important potential contribution of MDBS. This capability should be expanded and 
studied further. 

A final area for further research is the development of a more intuitive and efficient 
user interface for MDBS. The current command-line interface, which presents the user with 
a long series of sometimes confusing choices, is not a suitable interface for something with 
the power and flexibility of MDBS. The user interface issue needs to be thoroughly 
examined. The goal should be nothing less than redesigning the way in which MDBS 
interacts with the user. 

D. IMPLEMENTATION ISSUES 

There is another class of issues which may not involve pure research but which are 
still of great significance to the development of MDBS and its use by students at the Naval 
Postgraduate School. This section presents four such issues. 

First, the multi-model and multi-language capabilities which have been demonstrated 
and used for instruction should be implemented on the multi-backend version of the system. 
The network, hierarchical, and object-oriented model-language interfaces are currently 
implemented only on the single machine version of MDBS. Some effort has already been 
expended to begin moving these interfaces to the multi-backend version. This work should 
be continued, as the implementation of a single multi-model, multi-language, and 
multi-backend system will benefit students, researchers, and staff workers alike. 








Second, a review of the value used for TrackSize should be made. It can be assumed 
from its name that this value was once the physical track size of the base-data (record) disk.
This is no longer true, nor was it true on the previous hardware platform. The value of
TrackSize is used in the determination of the value of MSGLEN (message length) in
msg.def. It also figures in the calculation for the maximum number of fragments into which
an inter-machine message may be broken. Interestingly, the value of MSGLEN in
commdata.def is different from the one in meta.def. These inconsistencies need to be
resolved. 

Third, a review of the number and distribution of header files should be made. 
Currently, definitions are dispersed over a large number of directories. These could be 
combined into fewer files, each with a common purpose. For example, it would be useful 
to have those definitions pertaining to communications parameters in a single file. 

The fourth issue is a small one. The portion of the TI process which handles the kernel
model and language does not automatically generate descriptor (*.d) and template (*.t)
files for the user. The other language interface modules (e.g. the relational model and 
language interface) generate the descriptor and template files for the user based upon input 
from a description of the database schema. The user of the kernel (ABDL) interface must 
create these files off-line and add them to the system manually. A mechanism for creating 
these critical files should be added to the ABDL interface. 

E. RECOMMENDATIONS 

MDBS is a research tool, not a commercial product. It has been the subject of work by
many different students and researchers over many years. As a result, there are a few
shortcomings in its implementation. This section makes some recommendations for
strengthening the code.

It is recommended that more structure be added to the coding of MDBS. The current
implementation of MDBS uses the pre-ANSI (i.e., K & R) C language. This permits an
overuse of global variables and pointers which severely complicates the code and makes
maintenance or modification difficult. It is recommended that MDBS be rewritten in a more
structured language (including, possibly, ANSI C). This would allow greater error
checking at compile-time, improve portability, and make the system much easier to
understand.
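
As a minimal sketch of the compile-time checking argument, the fragment below contrasts a pre-ANSI (K & R) definition with an ANSI C prototype. The function name frag_count and its fragment arithmetic are hypothetical illustrations and are not taken from the MDBS source.

```c
/* Pre-ANSI (K & R) style, shown only as a comment because newer
 * compilers may reject it:
 *
 *     int frag_count(len, frag_size)
 *     int len, frag_size;
 *     { ... }
 *
 * With this form the compiler has no parameter information at the call
 * site, so a call with a missing or wrongly typed argument is accepted
 * silently. */

/* ANSI C prototype: parameter types are visible at every call site, so
 * a mismatched call is diagnosed (or converted) at compile time. */
int frag_count(int len, int frag_size);

/* Hypothetical helper: number of fixed-size fragments needed to carry
 * a message of len bytes. Returns 0 for non-positive arguments. */
int frag_count(int len, int frag_size)
{
    if (len <= 0 || frag_size <= 0)
        return 0;
    return (len + frag_size - 1) / frag_size;   /* round up */
}
```

With the prototype in scope, a call such as frag_count("abc") would be flagged by the compiler rather than failing at run time.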

It is recommended that a higher-quality compiler be procured for use by the
researchers and programmers. The current command-line compiler has noticeable
inconsistencies and provides little compile-time error checking.

It is recommended that the commenting of MDBS code be improved. Currently, large
areas of the existing code are not well commented. Until recently, many of the files did not
even include the name of the file as a comment. Lack of comments increases the difficulty
in reading and understanding the code.

It is recommended that two sections of the code be rewritten to improve their
portability. The first of these is ack.c (located in the COMMON directory). This file was
written after most of the other MDBS code and contains several strongly
implementation-dependent functions. The second section of code which should be rewritten
is the dbLtemplate() function in the dblc file. The nature of the dependencies is detailed in
Chapter IV, section C.

Finally, it is recommended that a staff programmer be permanently assigned to assist
with the MDBS project. The addition of a programmer/assistant would benefit the project,
the researchers, and the students. This individual could serve as a source for advice on
programming questions, as an expert trouble-shooter, and as the long-term “corporate
knowledge” for the project’s implementation details.






APPENDIX A. SYSTEM COMPARISON 


A. HARDWARE 

The following is a summary of the hardware comprising the old and new platforms.
Note that the same standard Ethernet communications network is used with both systems.

1. Old hardware 


a. Controller: ISI-V model V24S workstation (quantity one)


Table 1: OLD CONTROLLER HARDWARE

Item                            Detail
host name                       db8
central processing unit         Motorola 68020 (16 MHz)
physical ram                    4MB
fixed disk drive(s)             two Control Data CDC 86L ESDI drives,
                                each with 101 MB capacity (MDBS utilizes
                                a 100 MB partition mounted as "/u")
tape drive(s)                   one 1/2" reel, one 1/4" cartridge
communications backplane(s)     two


b. Backend: ISI-V model V24S workstation (quantity six)


Table 2: OLD BACKEND HARDWARE

Item                            Detail
host names                      db3, db4, db5, db6, db7, db9
central processing unit         Motorola 68020 (16 MHz)
physical ram                    4MB
fixed disk drive(s)             two Control Data CDC 86L ESDI drives,
                                each with 101 MB capacity (MDBS utilizes
                                one as a meta disk (raw device) and one
                                for program code storage (@2 MB, mounted
                                as "/u")), and one Control Data CDC
                                Swallow ESDI drive with 399 MB capacity
                                used by MDBS as a data disk (raw device)
communications backplane(s)     one


2. New Hardware 

a. Controller: Sun model 4/110 workstation (quantity one)


Table 3: NEW CONTROLLER HARDWARE

Item                            Detail
host name                       db11
central processing unit         Sun Sparc (RISC)
physical ram                    8MB
fixed disk drive(s)             one Micropolis 1558 with 373 MB capacity
                                (MDBS utilizes 100MB on a partition
                                mounted as "/u")
tape drive(s)                   none (backup accomplished via network
                                or portable tape unit)
communications backplane(s)     two
































b. Backend: Sun model 4/290 workstation (quantity two)


Table 4: NEW BACKEND HARDWARE

Item                            Detail
host names                      db12, db13
central processing unit         Sun Sparc (RISC)
physical ram                    16 MB
fixed disk drive(s)             one Hitachi DK 815-10 SCSI drive with
                                600^ MB capacity (MDBS uses only @2MB on
                                file system mounted as "/u" for program
                                storage) and two Quantum ProDrive 105S
                                SCSI drives each with 100MB capacity
                                (MDBS uses one as a meta disk and one as
                                a data disk; both are unmounted raw
                                devices)
tape drive(s)                   one 1/4" cartridge drive
communications backplane(s)     one


B. SOFTWARE 

The following is a summary of operating system and compiler software on the old and 
new systems. 

1. Old Software 


Table 5: OLD SYSTEM SOFTWARE

Item                Description
Operating System    ISIV version of BSD4.3 UNIX
Compiler            Standard C compiler (cc)
































2. New Software 


Table 6: NEW SYSTEM SOFTWARE

Item                Description
Operating System    SunOS 4.1.1
Compiler            Standard C compiler (cc)
















APPENDIX B. AIDS TO DEBUGGING 

A. AN INTRODUCTION

The suggestions offered in this appendix are meant to be of assistance to persons
modifying or trouble-shooting the MDBS system software. These hints are loosely divided
into six classes.

B. USING DEBUGGING FLAGS 

A very useful and flexible set of debugging routines has been built into the MDBS
code. These debugging routines, most of which print (printf()) useful information to the
terminal or trace files, are included amongst the working code in conditional compilation
constructs. These take the form:

    #ifdef identifier
        (debugging code)
    #endif

The identifier is the name of the debugging flag. The debugging code within the construct
is only compiled into the executable file if the identifier is defined in the header file
specified by the appropriate makefile. These header files are all named flags.def. One is
located in each of the process directories in which compilation takes place (ten in all). The
flag names are never actually removed from the flags.def files. They are commented out to
prevent inclusion of the debugging code. Each flags.def file contains a brief explanation of
the purpose of the flags contained therein. A number of new debugging statements which
may benefit persons porting MDBS were added during this porting project.
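
The construct described above can be sketched as follows. This is an illustration only: the flag name DEBUG_MSG_TRACE and the function send_message are hypothetical, and in MDBS the flag would be defined (or commented out) in the process's flags.def header rather than in the source file itself.

```c
#include <stdio.h>

/* Hypothetical debugging flag. Commenting out this line removes the
 * debugging code from the compiled program. */
#define DEBUG_MSG_TRACE

/* Hypothetical routine standing in for real working code. It returns
 * the number of debugging lines it printed so that the effect of the
 * flag can be observed by the caller. */
int send_message(const char *dest, int len)
{
    int debug_lines = 0;

#ifdef DEBUG_MSG_TRACE
    printf("send_message: dest=%s len=%d\n", dest, len);
    fflush(stdout);
    debug_lines++;
#endif

    (void)dest;   /* the real work of the routine would go here */
    (void)len;
    return debug_lines;
}
```

With the flag defined, a call such as send_message("DM", 12) prints one trace line; with the #define commented out, the same call prints nothing and the debugging code is absent from the executable.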

C. USING TRACE FILES 

Trace files are text files which contain the output of processes which do not write to
the terminal. Careful analysis of these files is the best way to trouble-shoot MDBS or learn
exactly how it performs most of its functions. The startcntrl (controller) and run.be
(backend) shell programs determine whether a process writes to the terminal or to a trace
file. Normally, only the TI process writes to the terminal. The others write to trace files, all
of which have names ending in .tr. Trace files are stored in the version (e.g., greg)/run/trace
directory on the controller and in the be.version (e.g., greg) directory on each backend
machine.

The amount of information which is output to the trace file is determined by the setting 
of the debugging flags for that process. Most of the information in a trace file is readily 
comprehensible once the codes are understood. These codes are contained in the msg.def 
header file. 

Often a clearly labeled error message will direct the reader to the source of a problem. 
Other times it is necessary to study all of the traces together while looking for an abnormal 
pattern. Depicting the message flow on a copy of the communication channel map (Figure 
8 in Chapter V) can be a tremendous help on these occasions. This is especially true where 
the death of one process causes some of the other connected processes to die. Abnormal 
termination of one MDBS process often causes a “domino” effect on the other processes. 

D. CHECKING SYSTEM STARTUP 

A few simple observations can help locate software problems which occur during 
system start-up. Check the Sockets directory to see if all of the processes are creating 
sockets. A new socket is created for every process on every run of the system. Check the 
trace file directories to see if all of the processes are creating their trace files. These are also 
deleted and created on each new run. Check to see if a process identification file (.pid file) 
has been created in the run directory on each machine. If any of these are missing, the 
system initialization is not progressing normally. 
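
The checks listed above amount to confirming that a known set of files exists after start-up. A minimal sketch using the POSIX stat() call is given below; the function name count_missing is hypothetical, and the caller is assumed to supply the socket, trace file, and .pid file paths appropriate to the installation.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns the number of paths (out of n) that do not exist, and
 * reports each missing path on stderr. */
int count_missing(const char *paths[], int n)
{
    struct stat sb;
    int i;
    int missing = 0;

    for (i = 0; i < n; i++) {
        if (stat(paths[i], &sb) != 0) {
            fprintf(stderr, "missing: %s\n", paths[i]);
            missing++;
        }
    }
    return missing;
}
```

A non-zero result suggests that system initialization is not progressing normally and that the trace files of the affected process should be examined first.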

Once the system appears to be up and running, execute a process status (ps) on both 
the controller and the backends. Check to see that all twelve MDBS processes (six on the 
controller, six on the backend) are running. Remember that there is a short delay before the 
get processes are started. 







E. MAINTAINING A HISTORY 

The generation and retention of detailed information about every modification and the
system’s response is strongly recommended. Use of the UNIX script utility allows the
trouble-shooter to capture screen output (such as the TI process output) or keep a permanent
record of the contents of the trace files from any given run. Scripting the results of the
execution of the make.results shell program (located in the version/bin directory) produces
a record of all of the compiler’s messages from the last make. Scripting system tests
provides a permanent reference. Scripting the trace files for detailed analysis is also
helpful. Something similar to the tpS.stan and tpl3.stan shell programs (located with the
trace files) may be helpful here.

F. UNIX DEBUGGING TOOLS 

The standard UNIX trouble-shooting tools (e.g., dbx, lint) were of limited assistance 
with MDBS problems. Most of these tools are useful for trouble-shooting a single process, 
but lack the facilities for running twelve processes at once. The dbx tool may be useful 
where one particular process is causing a serious problem such as a segmentation error. 

G. PROGRAMMING TIPS 

When making changes to code which is local to one process (that is, code not contained
in a COMMON directory), always execute a make (mk*) in that directory before trying a
system-wide makeall. The system-wide make takes from as little as twelve minutes to as
much as thirty-seven minutes, depending on how much of the language-interface code is
involved. This is a long time to wait only to discover a minor syntax error.

Whenever adding new debugging code, always immediately follow the printf()
statement with an fflush(stdout) statement. This ensures that the output of the printf()
statement goes immediately to the appropriate trace file, rather than languishing in the
buffer where it could be lost in case of process termination.
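
One way to make this habit automatic is to wrap the two calls in a single helper. The sketch below is an assumption for illustration (the name trace_printf does not appear in MDBS); it uses only the standard vprintf() and fflush() calls.

```c
#include <stdarg.h>
#include <stdio.h>

/* printf-style helper that flushes stdout after every message, so the
 * text reaches the terminal or trace file even if the process dies
 * immediately afterwards. Returns the number of characters written,
 * or a negative value on error. */
int trace_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vprintf(fmt, ap);
    va_end(ap);

    if (fflush(stdout) != 0)
        return -1;
    return n;
}
```

Debugging statements would then call trace_printf("...") in place of the printf()/fflush(stdout) pair.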












Lastly, it is often difficult to determine the time sequence of operations performed in
different processes. Adding the time(0) call to a printf() statement can provide useful clues
in the trace files. The time(0) call returns the system time (in whole seconds) as an integer.
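
A minimal sketch of a time-stamped trace line is shown below. The function name stamp_message and the line format are hypothetical; the sketch assumes a C library that provides snprintf().

```c
#include <stdio.h>
#include <time.h>

/* Formats "[seconds] process: text" into buf, where seconds is the
 * value returned by time(). Returns the value returned by snprintf(),
 * i.e. the length the full line requires. */
int stamp_message(char *buf, size_t size, const char *proc, const char *text)
{
    long now = (long)time(NULL);   /* whole seconds since the epoch */

    return snprintf(buf, size, "[%ld] %s: %s", now, proc, text);
}
```

Because the resolution is one second, lines from different processes that carry the same stamp cannot be ordered by this means alone, and the clocks of the controller and backend machines must agree for stamps to be comparable across machines.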




APPENDIX C. CONTROLLER DIRECTORY AND FILE STRUCTURE


The following is a listing of the important directories and files which make up the
controller on MDBS. 

A. THE DIRECTORY 

This is the “mdbs” directory on the controller. All the other directories descend from
this one. It also contains the temporary files (*.pid) used to store process identification
numbers for the controller processes. These files are deleted and then recreated on each run.
The path to this directory (/u/mdbs/) is hard-coded into the MDBS software as “HOME” in
the commdata.def file.

total 75
drwxrwxr-x 14 mdbs   1024 Jul  6 13:05 ./
drwxr-xr-x 13 root    512 Sep 18  1992 ../
-rw-rw-r--  1 mdbs   4387 May 29 09:59 .alias
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .cget.exe.pid
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .cput.exe.pid
-rw-rw-r--  1 mdbs    242 Apr 26 12:42 .cshrc
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .iig.exe.pid
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .pp.exe.pid
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .reqp.exe.pid
-rw-r--r--  1 mdbs    114 Mar  8 12:20 .rhosts
-rw-rw-r--  1 mdbs   1189 Feb  4 14:44 .rhosts.bak
-rw-rw-r--  1 mdbs      5 Jul  6 13:05 .ti.exe.pid
drwxr-xr-x  2 mdbs    512 Jul  6 13:05 Sockets/
drwxr-xr-x 10 mdbs   2048 Jul  6 13:05 UserFiles/
drwxrwxr-x  2 mdbs    512 Apr 30 12:07 bin/
drwxr-xr-x  7 mdbs   1536 May 10 19:42 greg/


B. THE “Sockets” DIRECTORY 

The “Sockets” directory contains the six sockets used for inter-process
communication in the controller. The G_PCLC socket is used by the CGET process. The
P_PCLC socket is used by the CPUT process. All of these sockets are deleted and recreated
for every run.

Sockets:
total 2
drwxr-xr-x  2 mdbs    512 Jul  6 13:05 ./
drwxrwxr-x 14 mdbs   1024 Jul  6 13:05 ../
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 G_PCLC=
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 IIG=
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 PP=
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 P_PCLC=
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 REQP=
srwxrwxrwx  1 mdbs      0 Jul  6 13:05 TI=


C. THE “UserFiles” DIRECTORY

The “UserFiles” directory contains the mass load files (*.r) and the controller's copy
of the descriptor (*.d) and template (*.t) files for each database used with the system. It also
contains the schema information files (*db) and stored transaction request files (*req) for
each database. The one transaction file containing the letters RTH (e.g. SQDRTHreq) is
part of the “relational-to-hierarchical” cross-model accessing capability of MDBS. This
controller is set up for use with three databases (i.e. COURSE, SALES, and SQD). For
details on how these files are used, see [Bourgeois, 1993]. The path to this directory
(/u/mdbs/UserFiles) is hard-coded into the MDBS software as “DATA.AREA” in the
commdata.def file. The directories under this one (abdm, daplex, hierarchical, etc.) hold
duplicate copies of these same database files for each model. Their contents are not listed
here for the sake of brevity.

UserFiles:
total 105
drwxr-xr-x 10 mdbs   2048 Jul  6 13:05 ./
drwxrwxr-x 14 mdbs   1024 Jul  6 13:05 ../
-rw-r--r--  1 mdbs     13 Jun  2  1992 .pw
-rw-rw-r--  1 mdbs     51 Feb 25 21:56 COURSE.d
-rw-r--r--  1 mdbs    172 Feb 25 21:25 COURSE.r
-rw-rw-r--  1 mdbs    126 Apr  9 12:11 COURSE.t
-rw-r--r--  1 mdbs    316 Oct 19  1992 COURSEsqldb
-rw-r--r--  1 mdbs    613 Oct 19  1992 COURSEsqlreq
-rw-r--r--  1 mdbs    289 Oct 13  1992 SALESreq
-rw-r--r--  1 mdbs    120 Oct 13  1992 SALES.d
-rw-r--r--  1 mdbs    263 Oct 13  1992 SALES.r
-rw-r--r--  1 mdbs    121 Oct 13  1992 SALES.t
-rw-rw-r--  1 mdbs     43 Mar 31 15:16 SQD.d
-rw-rw-r--  1 mdbs    140 Mar 31 15:16 SQD.t
-rw-rw-r--  1 mdbs    828 Feb 10 12:35 SQDRTHreq
-rw-rw-r--  1 mdbs    813 Feb 10 12:36 SQDreq
drwxrwxr-x  2 mdbs    512 Oct  6  1992 abdm/
drwxrwxr-x  2 mdbs    512 Oct  6  1992 daplex/
drwxrwxr-x  2 mdbs    512 Oct  6  1992 hierarchical/
drwxrwxr-x  2 mdbs    512 Oct  6  1992 network/
drwxrwxr-x  2 mdbs   1024 Oct 24  1992 relational/
drwxrwxr-x  2 mdbs    512 Oct  6  1992 »_an«i_f_files/


D. THE “bin” DIRECTORY

This is the uppermost of two “bin” directories in MDBS (the other is a subdirectory of
the “version” directory). This directory is used to store utility files, none of which are used
directly by MDBS. The “cpydisks” script was once used for distributing files. The three
scripts beginning with ‘z’ were used for zeroing meta and data disks. Both functions are
now accomplished in other ways.


bin:
total 134
drwxrwxr-x  2 mdbs    512 Apr 30 12:07 ./
drwxrwxr-x 14 mdbs   1024 Jul  6 13:05 ../
-rwxr-xr-x  1 mdbs    122 Jan 14  1989 .z*
-rwxr-xr-x  1 mdbs     84 Sep  1  1989 cpydisks
-rwxr-xr-x  1 mdbs     48 Jun 14  1989 2*
-rwxr-xr-x  1 mdbs    108 Sep  1  1989 zip*


E. THE VERSION (e.g., “greg”) DIRECTORY

The “version” directory serves as the top-level directory for each version of the
software on the system. Each version of MDBS software has a unique copy of this directory
and all of its subdirectories. The current version of the software is called “greg”, hence the
name of this directory. For details on version control, see [Meeks, 1993].

greg:
total 1151
drwxr-xr-x  7 mdbs   1536 May 10 19:42 ./
drwxrwxr-x 14 mdbs   1024 Jul  6 13:05 ../
drwxr-xr-x  8 mdbs    512 Jun  2 12:50 BE/
drwxr-xr-x  9 mdbs    512 Jun  2 11:46 CNTRL/
drwxr-xr-x  2 mdbs   1024 Apr 23 09:20 COMMON/
drwxr-xr-x  2 mdbs   1024 Jun  2 11:31 bin/
-rw-rw-r--  1 mdbs   1511 Oct 16  1992 d_u
-rw-rw-r--  1 mdbs  33666 May 10 19:26 du_a
drwxr-xr-x  7 mdbs   1536 Jul  6 13:07 run/







F. THE “version/bin” DIRECTORY

This is the second “bin” directory in the MDBS controller. This one contains many
important files and utilities. The makefile in this directory is used to control compilation
and linking of all twelve (six controller, six backend) MDBS processes. Executing “make
clean” followed by “makeall” from this directory will remove all object files from the
twelve process directories and then create and distribute the new executables. Note that this
process does not remove the object files making up the “TI” process which are located
below the “src” directory in the language interface modules. These files must be removed
manually. The shell script “make.results” displays the compiler and linker messages from
the last make. The two source files (main.c, configure.h) for the “main” executable, which
automates the running of MDBS, are located here. The working copy of the main
executable is copied to the “run” directory. The zero executable is copied here after being
created in the “DIO” directory. Zero handles the initialization of the backend data and meta
disks. It must be manually copied to the “bin” directory on each backend machine. Cpcount
is an executable which copies a specified number of bytes from one file to another. Rectag
is an executable utility for manipulating the data disk on a backend machine. It must be
copied to the backend “bin” directory. The cpydisks shell script was once used to distribute
the executable processes to the backend machines. It has been superseded by newer code
written into the “main” executable. The stop.cmd shell script is an older version of the files
used to stop MDBS processes. The newer ones are located in the “run” directory.
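
Cpcount is described above as copying a specified number of bytes from one file to another. The sketch below shows one way such a copy could be written with the standard I/O library. It is not the MDBS cpcount.c source; the function name copy_bytes and the buffer size are assumptions made for illustration.

```c
#include <stdio.h>

/* Copies up to count bytes from in to out. Returns the number of bytes
 * actually copied, which is less than count if the input ends early or
 * an I/O error occurs. */
long copy_bytes(FILE *in, FILE *out, long count)
{
    char buf[4096];
    long copied = 0;

    while (copied < count) {
        size_t want = sizeof buf;
        size_t got;

        if ((long)want > count - copied)
            want = (size_t)(count - copied);

        got = fread(buf, 1, want, in);
        if (got == 0)
            break;                      /* end of file or read error */
        if (fwrite(buf, 1, got, out) != got)
            break;                      /* write error */
        copied += (long)got;
    }
    return copied;
}
```

A command-line wrapper would open the source and destination files with fopen(), convert the byte count argument with strtol(), and call copy_bytes().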

greg/bin:
total 422
drwxr-xr-x  2 mdbs   1024 Jun  2 11:31 ./
drwxr-xr-x  7 mdbs   1536 May 10 19:42 ../
-rw-rw-r--  1 mdbs    122 Feb 25 12:54 Makefile
-rwxrwxr-x  1 mdbs  37152 Feb  5 13:40 configure
-rw-rw-r--  1 mdbs   5673 May 19 10:39 configure.h
-rwxr-xr-x  1 mdbs  17948 Jun  2  1992 constants
-rwxr-xr-x  1 mdbs  26479 Jun  2  1992 cpcount*
-rwxr-xr-x  1 mdbs     84 Jun  2  1992 cpydisks*
-rwxrwxr-x  1 mdbs  70533 Jul  2 15:23 disp*
-rwxrwxr-x  1 mdbs  37611 May 19 10:40 main*
-rw-rw-r--  1 mdbs  33550 May 19 10:35 main.c
-rw-rw-r--  1 mdbs  14194 May 19 10:40 main.o
-rwxr--r--  1 mdbs    932 Apr 30 12:05 make.results*
-rwxr-xr-x  1 mdbs    557 Apr  9 11:42 makeall*
-rw-r--r--  1 mdbs    664 Apr  9 12:14 makefile
-rwxr-xr-x  1 mdbs  26590 Jun  2  1992 rectag*
-rwxr-xr-x  1 mdbs    306 Feb  5 13:44 stop.cmd*
-rw-rw-r--  1 mdbs   1984 Apr 30 12:08 [illegible]
-rwxr--r--  1 mdbs  26439 Jun  2  1992 zero*


G. THE “run” DIRECTORY

This is the directory from which MDBS is normally run. It contains the executables
and scripts necessary to control the orderly generation of the system. The “main”
executable is the program which controls all of the others. It calls the zero command in each
backend using the zero.db* scripts. It calls run.be on each backend to start the backend
processes. The master.run.be.file is a master copy of the files located on the backend
machines. Main also calls startcntrl to start the controller processes. Ultimately, it calls the
stop.db* scripts to stop the processes when shutdown is signalled. This directory also
contains numerous temporary files created by MDBS. The *.dbl files are database listing
files. Each contains the names of the databases that exist for each model. The *dbs.dat
(database data) files hold schema information about each database. The information in the
*dbs.dat files is used to generate a catalog file (*.cat) for each database. The .qry.file and
.TransFile temporary files hold information about the latest query transactions. The
.config.db file stores information (i.e., machine host names) for the most recent MDBS
configuration. The “trace” subdirectory contains trace (*.tr) files generated on the latest run
of MDBS. Each trace file contains the output from the controller process of the same name.
The backend process trace files may be found on each backend machine.

greg/run:
total 243
drwxr-xr-x  7 mdbs   1536 Jul  6 13:07 ./
drwxr-xr-x  7 mdbs   1536 May 10 19:42 ../
-rw-rw-r--  1 mdbs    266 Apr 13 13:31 .COURSE.cat
-rw-rw-r--  1 mdbs    270 Apr  9 12:13 .DTH.cat
-rw-rw-r--  1 mdbs    156 Apr  9 12:13 .SQD.cat
-rw-r--r--  1 mdbs    353 Apr  9 12:13 .Syntax
-rw-rw-r--  1 mdbs    929 Apr 23 11:28 .TransFile
-rw-rw-r--  1 mdbs      9 Apr 26 15:27 .config.db
-rw-r--r--  1 mdbs     10 Apr  9 12:13 .curr_file
-rwxr-xr-x  1 mdbs      0 Apr  9 12:13 .dapdbs.dat*
-rw-        1 mdbs     19 Apr  9 12:13 .exe.awk
-rw-rw-r--  1 mdbs      4 Apr  9 12:13 .hie.dbl
-rw-rw-r--  1 mdbs    154 Apr  9 12:13 .hiedbs.dat
-rw-rw-r--  1 mdbs     13 Apr  9 12:13 .net.dbl
-rw-r--r--  1 mdbs    726 Apr  9 12:13 .netdbs.dat
-rw-rw-r--  1 mdbs      0 Apr 23 11:28 .output
-rw-rw-r--  1 mdbs    728 Apr 23 11:29 .qry.file
-rw-rw-r--  1 mdbs    698 Apr  9 12:13 .reldbs.dat
-rw-rw-r--  1 mdbs     16 Apr 13 13:31 .sql.dbl
-rwxrwxr-x  1 mdbs  37611 May 19 10:40 main*
-rwxr--r--  1 mdbs    436 Nov  6  1992 master.run.be*
-rwxr--r--  1 mdbs    426 Mar  4 13:37 start.cntrl*
-rwxr-xr-x  1 mdbs     76 Feb 23 12:12 stop.db1*
-rwxr-xr-x  1 mdbs     76 Feb 23 12:12 stop.db2*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db3*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db4*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db5*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db6*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db7*
-rwxr-xr-x  1 mdbs    314 Mar  4 13:59 stop.db8*
-rwxr-xr-x  1 mdbs     76 Nov  6  1992 stop.db9*
drwxrwxr-x  2 mdbs    512 Jul 15 10:04 trace/
-rwxr--r--  1 mdbs    230 Feb 23 13:31 zero.db1*
-rwxr--r--  1 mdbs    230 Feb 23 13:31 zero.db2*
-rwxr--r--  1 mdbs    230 Feb 23 13:31 zero.db3*
-rwxr--r--  1 mdbs    230 Feb 23 13:31 zero.db4*
-rwxr--r--  1 mdbs    309 Feb  5 13:30 zero.db5*
-rwxr--r--  1 mdbs    279 Feb 23 13:32 zero.db6*
-rwxr--r--  1 mdbs    230 Feb 23 13:32 zero.db7*
-rwxr--r--  1 mdbs    719 Apr  7 11:17 zero.db8*
-rwxr--r--  1 mdbs    279 Feb 23 13:32 zero.db9*


H. THE “version/COMMON” DIRECTORY

The COMMON directory under the version directory contains source code common
to both controller and backend processes. Much of the code dealing with inter-process and
inter-machine communications is located here. Numerous hardware and network specific
definitions are also contained in the header files located here.

greg/COMMON:
total 241
drwxr-xr-x  2 mdbs   1024 Apr 23 09:20 ./
drwxr-xr-x  7 mdbs   1536 May 10 19:42 ../
-rw-r--r--  1 mdbs  16679 Jun  2  1992 ack.c
-rw-r--r--  1 mdbs   1279 Jun  2  1992 ack.dcl
-rw-r--r--  1 mdbs    320 Jun  2  1992 ack.def
-rw-r--r--  1 mdbs     48 Jun  2  1992 beno.dcl
-rw-r--r--  1 mdbs     69 Jun  2  1992 beno.def
-rw-r--r--  1 mdbs   8661 Jun  2  1992 cb.c
-rw-r--r--  1 mdbs   6706 Jun  2  1992 comio.c
-rw-r--r--  1 mdbs  13788 Jun  2  1992 commdata.def
-rw-r--r--  1 mdbs    615 Jun  2  1992 commsg.c
-rw-r--r--  1 mdbs    548 Jun  2  1992 dblgeneral.c
-rw-r--r--  1 mdbs   7774 Jun  2  1992 dbtmpmod.c
-rw-r--r--  1 mdbs   3836 Jun  2  1992 errormsg.c
-rw-r--r--  1 mdbs   9360 Jun  2  1992 generals.c
-rw-r--r--  1 mdbs   1133 Jun  2  1992 mssnd.c
-rw-r--r--  1 mdbs     63 Jun  2  1992 msg.dcl
-rw-r--r--  1 mdbs   6958 Jun  2  1992 msg.def
-rw-r--r--  1 mdbs    156 Jun  2  1992 msg.ext
-rw-r--r--  1 mdbs    555 Jun  2  1992 newdb.c
-rw-r--r--  1 mdbs   1694 Jun  2  1992 newtmpl.c
-rw-r--r--  1 mdbs  28116 Jun  2  1992 pcl.c
-rw-r--r--  1 mdbs    899 Apr  2 11:35 pcl.def
-rw-r--r--  1 mdbs    590 Jun  2  1992 select.c
-rw-r--r--  1 mdbs    369 Jun  2  1992 setbeno.c
-rw-r--r--  1 mdbs    350 Jun  2  1992 setnobes.c
-rw-r--r--  1 mdbs  27857 Jun 14  1992 sndrcv.c
-rw-r--r--  1 mdbs   1734 Jun  2  1992 sndrcv.dcl
-rw-r--r--  1 mdbs    296 Jun  2  1992 sndrcv.def
-rw-r--r--  1 mdbs    946 Jun  2  1992 sndrcv.ext
-rw-r--r--  1 mdbs    173 Jun  2  1992 tmpl.dcl
-rw-r--r--  1 mdbs    804 Jun  2  1992 tmpl.def
-rw-r--r--  1 mdbs    188 Jun  2  1992 tmpl.ext
-rw-r--r--  1 mdbs   7569 Jun  2  1992 utilities.c
-rw-r--r--  1 mdbs   1592 Jun  2  1992 waitmsg.c


I. THE “BE” DIRECTORY

The BE directory is the top-level directory for all of the backend source code and
executables. Under the current approach, all code (including backend code) is compiled on
the controller and then copied to its appropriate destination. Copies of the backend
executables are kept in this directory. The six subdirectories under this directory hold
source code, object code, and makefiles for each backend process.

greg/BE:
total 1864
drwxr-xr-x  8 mdbs    512 Jun  2 12:50 ./
drwxr-xr-x  7 mdbs   1536 May 10 19:42 ../
drwxr-xr-x  3 mdbs    512 Jun  2 11:58 BCOM/
drwxr-xr-x  3 mdbs    512 Jun  2 11:47 CC/
drwxr-xr-x  3 mdbs    512 Oct 25  1992 COMMON/
drwxr-xr-x  3 mdbs    512 Jun  2 11:49 DIO/
drwxr-xr-x  3 mdbs   1024 Jun  2 11:51 DM/
drwxr-xr-x  3 mdbs   1024 Jun  2 12:00 RECP/
-rwxrwxr-x  1 mdbs  91847 Jul  6 13:01 bget.exe*
-rwxrwxr-x  1 mdbs  91847 Jul  6 13:01 bput.exe*
-rwxrwxr-x  1 mdbs 104564 Jul  2 14:37 cc.exe*
-rwxrwxr-x  1 mdbs  55518 Jul  2 14:36 dio.exe*
-rwxrwxr-x  1 mdbs 126485 Jul  2 15:23 dirman.exe
-rwxrwxr-x  1 mdbs 150366 Jul  2 14:50 recp.exe*


J. THE “BE/COMMON” DIRECTORY

This directory holds code which is shared by two or more backend processes. Its
“Object” subdirectory is used to store a significant amount of object code used by backend
processes.

greg/BE/COMMON:
total 7
drwxr-xr-x  3 mdbs    512 Oct 25  1992 ./
drwxr-xr-x  8 mdbs    512 Jun  2 12:50 ../
drwxr-xr-x  2 mdbs    512 Jun  2 11:59 Object/
-rw-r--r--  1 mdbs   3349 Jun  2  1992 tmplsr.c


K. THE “BE/BCOM” DIRECTORY 

This directory contains the source code needed to compile the backend get (BGET)
and put (BPUT) processes. Flags.def contains flags whose settings determine what, if any,
debugging code will be compiled into the executable files. The mk* script can be used to
recompile just the code in this subdirectory. System-wide recompilation is controlled from
the “version/bin” directory. The make_result file contains the compiler’s comments (errors,
warnings, etc.) about the most recent compilation. The “Object” subdirectory contains the
object code generated from this source code as well as the makefile and shell scripts for
copying executables to their proper location.

greg/BE/BCOM:
total 14
drwxr-xr-x  3 mdbs    512 Jun  2 11:58 ./
drwxr-xr-x  8 mdbs    512 Jun  2 12:50 ../
drwxr-xr-x  2 mdbs    512 Jul  6 13:01 Object/
-rw-r--r--  1 mdbs   3281 Jun  2  1992 bget.c
-rw-r--r--  1 mdbs   1612 Jun  2  1992 bput.c
-rw-r--r--  1 mdbs     33 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    841 Jul  6 12:59 flags.def
-rw-rw-r--  1 mdbs    394 Jul  6 13:01 make.result
-rwxr--r--  1 mdbs    106 Jun  2  1992 mk*


L. THE “BE/CC” DIRECTORY

This directory contains the source code needed to make the executable for the 
concurrency control (CC) process. The makefiles and shell scripts function like those in the 
“BCOM” directory. 

greg/BE/CC:
total 202
drwxr-xr-x  3 mdbs    512 Jun  2 11:47 ./
drwxr-xr-x  8 mdbs    512 Jun  2 12:50 ../
-rw-r--r--  1 mdbs    504 Jun  2  1992 .fixed
drwxr-xr-x  2 mdbs   1024 Jul  2 14:38 Object/
-rw-r--r--  1 mdbs  21349 Jun  2  1992 atut.c
-rw-r--r--  1 mdbs    371 Jun  2  1992 cc.dcl
-rw-r--r--  1 mdbs   6719 Jun  2  1992 cc.def
-rw-r--r--  1 mdbs    443 Jun  2  1992 cc.ext
-rw-r--r--  1 mdbs  16851 Jun  2  1992 cccs.c
-rw-r--r--  1 mdbs  17883 Jun  2  1992 ceds.c
-rw-r--r--  1 mdbs  10408 Jun  2  1992 ccmain.c
-rw-r--r--  1 mdbs  16869 Jun  2  1992 cerp.c
-rw-r--r--  1 mdbs  13063 Jun  2  1992 cesr.c
-rw-r--r--  1 mdbs    680 Jun  2  1992 cinit.c
-rw-r--r--  1 mdbs  14478 Jun  2  1992 ctut.c
-rw-r--r--  1 mdbs     31 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    824 Jul  2 14:32 flags.def
-rw-rw-r--  1 mdbs    848 Jul  2 14:37 make.result
-rw-r--r--  1 mdbs   7214 Jun  2  1992 mallocs.c
-rwxr--r--  1 mdbs    107 Jun  2  1992 mk*
-rw-r--r--  1 mdbs  24214 Jun  2  1992 tuat.c
-rw-r--r--  1 mdbs  15258 Jun  2  1992 tuct.c
-rw-r--r--  1 mdbs  21221 Jun  2  1992 tudist.c
-rw-r--r--  1 mdbs   1156 Jun  2  1992 unixeinit.c
-rw-r--r--  1 mdbs    979 Jun  2  1992 update.c


M. THE “BE/DIO” DIRECTORY 

This directory holds the source code for dio.exe, the basis of the record disk 
input-output (DIO) process. 

greg/BE/DIO:
total 36
drwxr-xr-x  3 mdbs     512 Jun  2 11:49 ./
drwxr-xr-x  8 mdbs     512 Jun  2 12:50 ../
drwxr-xr-x  2 mdbs     512 Jul  2 14:36 Object/
-rw-r--r--  1 mdbs    1610 Jun  2  1992 cpcount.c
-rw-r--r--  1 mdbs      33 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs   15993 Jun  2  1992 dio.c
-rw-r--r--  1 mdbs    2102 Jun  2  1992 dio.h
-rw-r--r--  1 mdbs     828 Jul  2 14:34 flags.def
-rw-rw-r--  1 mdbs     183 Jul  2 14:36 make_result
-rwxr--r--  1 mdbs     112 Jun  2  1992 mk*
-rw-r--r--  1 mdbs    4994 Jun  2  1992 rectag.c
-rw-r--r--  1 mdbs    1792 Jun  2  1992 zero.c


N. THE “BE/DM” DIRECTORY 

This directory contains the source code for the directory management (DM) process. 

greg/BE/DM:
total 289
drwxr-xr-x  3 mdbs    1024 Jun  2 11:51 ./
drwxr-xr-x  8 mdbs     512 Jun  2 12:50 ../
drwxr-xr-x  2 mdbs    1024 Jul  2 15:23 Object/
-rw-r--r--  1 mdbs    8329 Jun  2  1992 ag.c
-rw-r--r--  1 mdbs    7712 Jun  2  1992 atm.c
-rw-r--r--  1 mdbs    5455 Jun  2  1992 beno.c
-rw-r--r--  1 mdbs    4493 Jun  2  1992 cdtmbe.c
-rw-r--r--  1 mdbs    1055 Jun  2  1992 cdtmm.c
-rw-r--r--  1 mdbs    9648 Jun  2  1992 common.c
-rw-r--r--  1 mdbs    1688 Jun  2  1992 constants.c
-rw-r--r--  1 mdbs   14980 Jun  2  1992 cs.c
-rw-r--r--  1 mdbs    4503 Jun  2  1992 cslresta.c
-rw-r--r--  1 mdbs    5999 Jun  2  1992 cs3rest.c
-rw-r--r--  1 mdbs    1562 Jun  2  1992 dbinit.c
-rw-r--r--  1 mdbs      31 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    2369 Jun  2  1992 ddit.c
-rw-r--r--  1 mdbs    5262 Jun  2  1992 desc.c
-rw-r--r--  1 mdbs   14576 Jun  2  1992 didef.c
-rw-r--r--  1 mdbs   19862 Jun  2  1992 dirman.c
-rw-r--r--  1 mdbs     407 Jun  2  1992 dirman.dcl
-rw-r--r--  1 mdbs    5062 Jun  2  1992 dirman.def
-rw-r--r--  1 mdbs     421 Jun  2  1992 dirman.ext
-rw-r--r--  1 mdbs    7179 Jun  2  1992 disp.c
-rw-r--r--  1 mdbs    3247 Jun  2  1992 dmfree.c
-rw-r--r--  1 mdbs    1024 Jun  2  1992 dmnomore.c
-rw-r--r--  1 mdbs   13480 Jun  2  1992 dmsr.c
-rw-r--r--  1 mdbs     904 Jun  2  1992 dmupdfin.c
-rw-r--r--  1 mdbs   10234 Jun  2  1992 ds.c
-rw-r--r--  1 mdbs    3790 Jun  2  1992 dsdone.c
-rw-r--r--  1 mdbs     825 Jul  2 15:17 flags.def
-rw-rw-r--  1 mdbs    1491 Jul  2 15:23 make_result
-rw-r--r--  1 mdbs    5703 Jun  2  1992 mallocs.c
-rw-r--r--  1 mdbs   45200 Jun  2  1992 meta.c
-rw-r--r--  1 mdbs    9750 Jun  2  1992 meta.def
-rwxr--r--  1 mdbs     111 Jun  2  1992 mk*
-rw-r--r--  1 mdbs    6392 Jun  2  1992 newdesc.c
-rw-r--r--  1 mdbs    7425 Jun  2  1992 oldnew.c
-rw-r--r--  1 mdbs   16676 Jun  2  1992 rdtsave.c
-rw-r--r--  1 mdbs    1300 Jun  2  1992 rdtsort.c
-rw-r--r--  1 mdbs    4513 Jun  2  1992 tabledump.c
-rw-r--r--  1 mdbs   16447 Jun  2  1992 tu.c


O. THE “BE/RECP” DIRECTORY 

This directory contains source code for the record processing (RECP) process. 

greg/BE/RECP:
total 250
drwxr-xr-x  3 mdbs    1024 Jun  2 12:00 ./
drwxr-xr-x  8 mdbs     512 Jun  2 12:50 ../
drwxr-xr-x  2 mdbs    1024 Jul  2 14:50 Object/
-rw-r--r--  1 mdbs    9528 Jun  2  1992 allsto.c
-rw-r--r--  1 mdbs      48 Jun  2  1992 beno.dcl
-rw-r--r--  1 mdbs    5291 Jun  2  1992 chkqry.c
-rw-r--r--  1 mdbs     978 Jun  2  1992 chkwait.c
-rw-r--r--  1 mdbs     152 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    1554 Jun  2  1992 delp.c
-rw-r--r--  1 mdbs    9466 Jun  2  1992 disks.c
-rw-r--r--  1 mdbs     145 Jun  2  1992 disks.dcl
-rw-r--r--  1 mdbs     773 Jun  2  1992 disks.def
-rw-r--r--  1 mdbs      63 Jun  2  1992 disks.ext
-rw-r--r--  1 mdbs    1463 Jun  2  1992 findrp.c
-rw-r--r--  1 mdbs     865 Jul  2 14:39 flags.def
-rw-r--r--  1 mdbs    3135 Jun  2  1992 insp.c
-rw-rw-r--  1 mdbs    1163 Jul  2 14:50 make_result
-rw-r--r--  1 mdbs    2219 Jun  2  1992 mallocs.c
-rwxr--r--  1 mdbs     111 Jun  2  1992 mk*
-rw-r--r--  1 mdbs    1132 Jun  2  1992 nomore.c
-rw-r--r--  1 mdbs    6632 Jun  2  1992 rbabs.c
-rw-r--r--  1 mdbs    6006 Jun  2  1992 rcreqs.c
-rw-r--r--  1 mdbs   19954 Jun  2  1992 recproc.c
-rw-r--r--  1 mdbs     257 Jun  2  1992 recproc.dcl
-rw-r--r--  1 mdbs    4875 Jun  2  1992 recproc.def
-rw-r--r--  1 mdbs     283 Jun  2  1992 recproc.ext
-rw-r--r--  1 mdbs   19690 Jun  2  1992 recpsr.c
-rw-r--r--  1 mdbs   10332 Jun  2  1992 retby.c
-rw-r--r--  1 mdbs   25912 Jun  2  1992 retcom.c
-rw-r--r--  1 mdbs   12334 Jun  2  1992 retp.c
-rw-r--r--  1 mdbs    1154 Jun  2  1992 rpcont.c
-rw-r--r--  1 mdbs    3377 Jun  2  1992 rpfree.c
-rw-r--r--  1 mdbs    4553 Jun  2  1992 streqs.c
-rw-r--r--  1 mdbs    1165 Jun  2  1992 unixdisks.c
-rw-r--r--  1 mdbs     433 Jun  2  1992 unixdisks.def
-rw-r--r--  1 mdbs   20414 Jun  2  1992 updp.c
-rw-r--r--  1 mdbs    4693 Jun  2  1992 wereqs.c









P. THE “version/CNTRL” DIRECTORY 


This is the top-level directory for controller source and object code. It contains the 

executables for each of the six controller processes (CGET, CPUT, IIG, PP, REQP, and TI). 

These executables are copied here by shell scripts after being created in the subdirectories 

bearing their names. 

greg/CNTRL:
total 1784
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ./
drwxr-xr-x  7 mdbs    1536 May 10 19:42 ../
drwxr-xr-x  3 mdbs     512 Jun  2 11:34 CCOM/
drwxr-xr-x  3 mdbs     512 Oct 25  1992 COMMON/
drwxr-xr-x  3 mdbs     512 Jun 20 12:25 IIG/
drwxr-xr-x  3 mdbs     512 Jun  2 11:36 PP/
drwxr-xr-x  3 mdbs     512 Jun  2 11:38 REQP/
drwxr-xr-x  4 mdbs    1024 Jun  2 11:40 TI/
-rwxrwxr-x  1 mdbs   91811 Jun 20 13:09 cget.exe*
-rwxrwxr-x  1 mdbs   91811 Jun 20 13:09 cput.exe*
-rwxrwxr-x  1 mdbs   56680 Jun  2 12:48 iig.exe*
-rwxrwxr-x  1 mdbs   56971 Jun  2 12:48 pp.exe*
-rwxrwxr-x  1 mdbs   84382 Jun  2 12:48 reqp.exe*
-rwxrwxr-x  1 mdbs  496436 Jun  7 10:11 ti.exe*


Q. THE “CNTRL/COMMON” DIRECTORY 

This directory contains source code common to two or more controller processes. The 
“Object” subdirectory holds a large quantity of object code for controller processes used by 
the linker. 

greg/CNTRL/COMMON:
total 7
drwxr-xr-x  3 mdbs     512 Oct 25  1992 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  2 mdbs     512 Jun 20 13:09 Object/
-rw-r--r--  1 mdbs    3209 Jun  2  1992 tmplsr.c


R. THE “CNTRL/CCOM” DIRECTORY 

This directory holds the source code needed to create the executable files for the 
controller’s get (CGET) and put (CPUT) processes. The functions of the makefile and shell 
scripts are identical to those discussed for the “BE/BCOM” directory above. 







greg/CNTRL/CCOM:
total 13
drwxr-xr-x  3 mdbs     512 Jun  2 11:34 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  2 mdbs     512 Jun 20 13:09 Object/
-rw-r--r--  1 mdbs    2059 Jun  2  1992 cget.c
-rw-r--r--  1 mdbs    1225 Jun  2  1992 cput.c
-rw-r--r--  1 mdbs      33 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs     830 Jun 20 13:06 flags.def
-rw-rw-r--  1 mdbs     556 Jun 20 13:09 make_result
-rwxr--r--  1 mdbs     106 Jun  2  1992 mk*


S. THE “CNTRL/IIG” DIRECTORY 

This directory contains the source code for the insert-information-generator (IIG) 
process. 

greg/CNTRL/IIG:
total 49
drwxr-xr-x  3 mdbs     512 Jun 20 12:25 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  2 mdbs     512 Jun  2 11:34 Object/
-rw-r--r--  1 mdbs   11286 Jun  2  1992 bes.c
-rw-r--r--  1 mdbs     498 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    4287 Jun  2  1992 didgen.c
-rw-r--r--  1 mdbs     827 Jun  2 11:29 flags.def
-rw-r--r--  1 mdbs   11255 Jun  2  1992 iig.c
-rw-r--r--  1 mdbs     244 Jun  2  1992 iig.dcl
-rw-r--r--  1 mdbs    1295 Jun  2  1992 iig.def
-rw-r--r--  1 mdbs     236 Jun  2  1992 iig.ext
-rw-r--r--  1 mdbs    2931 Jun  2  1992 iigdbl.c
-rw-r--r--  1 mdbs    5936 Jun  2  1992 iigsr.c
-rw-rw-r--  1 mdbs     833 Jun  2 11:34 make_result
-rwxr--r--  1 mdbs     118 Jun  2  1992 mk*


T. THE “CNTRL/PP” DIRECTORY 

This directory contains the source code for the program supporting the 
post-processing (PP) process. 

greg/CNTRL/PP:
total 56
drwxr-xr-x  3 mdbs     512 Jun  2 11:36 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  2 mdbs     512 Jun  2 11:38 Object/
-rw-r--r--  1 mdbs     137 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs     827 Jun  2 11:29 flags.def
-rw-rw-r--  1 mdbs     654 Jun  2 11:37 make_result
-rwxr--r--  1 mdbs     117 Jun  2  1992 mk*
-rw-r--r--  1 mdbs    8486 Jun  2  1992 pp.c
-rw-r--r--  1 mdbs     157 Jun  2  1992 pp.dcl
-rw-r--r--  1 mdbs    1379 Jun  2  1992 pp.def
-rw-r--r--  1 mdbs     116 Jun  2  1992 pp.ext
-rw-r--r--  1 mdbs    7201 Jun  2  1992 ppby.c
-rw-r--r--  1 mdbs    8195 Jun  2  1992 pprba.c
-rw-r--r--  1 mdbs    5582 Jun  2  1992 ppsr.c
-rw-r--r--  1 mdbs   11447 Jun  2  1992 repmon.c


U. THE “CNTRL/REQP” DIRECTORY 

This directory holds the source code for the request processing (REQP) process. 


greg/CNTRL/REQP:
total 119
drwxr-xr-x  3 mdbs     512 Jun  2 11:38 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  2 mdbs    1024 Jun  2 11:40 Object/
-rw-r--r--  1 mdbs   13546 Jun  2  1992 chkptu.c
-rw-r--r--  1 mdbs      35 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs     829 Jun  2 11:30 flags.def
-rw-r--r--  1 mdbs     823 Oct  4  1992 flags.def.nli
-rw-r--r--  1 mdbs    4218 Jun  2  1992 lsrc
-rw-rw-r--  1 mdbs     889 Jun  2 11:40 make_result
-rw-r--r--  1 mdbs    2989 Jun  2  1992 mallocs.c
-rwxr--r--  1 mdbs     107 Jun  2  1992 mk*
-rw-r--r--  1 mdbs   14580 Jun  2  1992 reqcomp.c
-rw-r--r--  1 mdbs   18334 May 12 08:38 reqp.c
-rw-r--r--  1 mdbs     125 Jun  2  1992 reqp.dcl
-rw-r--r--  1 mdbs     677 Jun  2  1992 reqp.def
-rw-r--r--  1 mdbs      91 Jun  2  1992 reqp.ext
-rw-r--r--  1 mdbs   13270 Jun  2  1992 reqpsr.c
-rw-r--r--  1 mdbs   40029 Jun  2  1992 ysrc


V. THE “CNTRL/TI” DIRECTORY 

This directory, and the many subdirectories beneath it, hold the source code for the test 
interface (TI) process. The TI process directly interacts with the user by receiving terminal 
instructions and displaying results. This directory contains the code supporting the kernel 
data language. The directories beneath this one (under the LangIF subdirectory) contain the 
code supporting the other model-language interfaces (hierarchical, network, relational, 
object-oriented, and functional). It is important to observe that all of the source code in this 
directory and every directory beneath it is a part of the same TI process. 









greg/CNTRL/TI:
total 207
drwxr-xr-x  4 mdbs    1024 Jun  2 11:40 ./
drwxr-xr-x  9 mdbs     512 Jun  2 11:46 ../
drwxr-xr-x  5 mdbs     512 Mar  5 12:39 LangIF/
drwxr-xr-x  2 mdbs    1024 Jun  7 10:11 Object/
-rw-r--r--  1 mdbs    6555 Mar  5 12:49 dbl.c
-rw-r--r--  1 mdbs     155 Jun  2  1992 dblocal.def
-rw-r--r--  1 mdbs    6052 Jun  2  1992 dblsr.c
-rw-r--r--  1 mdbs     833 Jun  7 09:58 flags.def
-rw-r--r--  1 mdbs    7384 Jun  2  1992 gdb.c
-rw-r--r--  1 mdbs    7149 Jun  2  1992 gsdesc.c
-rw-r--r--  1 mdbs    6214 Jun  2  1992 gsgenrec.c
-rw-r--r--  1 mdbs    4613 Jun  2  1992 gsgmset.c
-rw-r--r--  1 mdbs    4682 Jun  2  1992 gsmodset.c
-rw-r--r--  1 mdbs    2948 Jun  2  1992 gstirqpl.c
-rw-r--r--  1 mdbs   31537 Jun  2  1992 intest.c
-rw-rw-r--  1 mdbs    1587 Jun  7 10:11 make_result
-rwxr--r--  1 mdbs     204 Sep 30  1992 mk*
-rw-r--r--  1 mdbs    4729 Apr  2 15:16 ti.c
-rw-r--r--  1 mdbs    9741 Jun  2  1992 tireqs.c
-rw-r--r--  1 mdbs   16120 Jun  2  1992 tireqsubs.c
-rw-r--r--  1 mdbs    9185 Jun  2  1992 tisr.c
-rw-r--r--  1 mdbs   25448 Jun  2  1992 tisubs.c
-rw-r--r--  1 mdbs   20371 Mar  5 13:18 tstint.c
-rw-rw-r--  1 mdbs   21055 Apr 16 15:23 tstint.c.bak
-rw-r--r--  1 mdbs    1049 Jun  2  1992 tstint.dcl
-rw-r--r--  1 mdbs    1441 Jun  2  1992 tstint.def
-rw-r--r--  1 mdbs    1188 Jun  2  1992 tstint.ext
-rw-r--r--  1 mdbs    1961 Jun  2  1992 unixtime.c


W. THE “CNTRL/TI/LangIF” DIRECTORY 

This directory serves as the parent directory for all of the non-kernel data 
model/language interfaces. All of the code supporting these interfaces can be compiled and 
linked using the makefiles located here. Each model/language interface can also be 
individually compiled and linked in the corresponding lower level subdirectory. The 
“include” subdirectory holds header files common to two or more of the non-kernel 
model/language interfaces. The “lib” subdirectory is where archival copies of the language 
interface code are stored. The “src” subdirectory leads directly to the code for each 
model/language interface. These lower level directories are not shown here, but each is 
logically divided into four sections (kernel controller, kernel formatting system, kernel 
mapping system, and language interface). 









APPENDIX D. BACKEND DIRECTORY AND FILE INFORMATION 


The following is an annotated listing of the directories and files making up each 
backend in the Multibackend Database System. Note that the compiled C executables are 
normally compiled on the controller and then copied into the proper directories on the 
backend. 

A. THE “/U” ROOT DIRECTORY 

The “/u” root directory holds numerous temporary files created, used, and deleted by 
MDBS. The .pid files are used by MDBS to store process identification numbers generated 
by the operating system on each run. The .alias file is used by researchers but is not utilized 
directly by MDBS. 


drwxrwxr-x 13 mdbs    1024 Feb 27 20:12 ./
drwxr-xr-x 14 root     512 Feb 18  1992 ../
-rw-rw-r--  1 mdbs    1668 Jun 14  1989 .alias
-rw-r--r--  1 mdbs       4 Feb 25 22:56 .bget.exe.pid
-rw-r--r--  1 mdbs       4 Feb 25 22:56 .bput.exe.pid
-rw-r--r--  1 mdbs       4 Feb 25 22:56 .cc.exe.pid
-rwxr-xr-x  1 mdbs     223 Jan 14  1989 .cshrc*
-rw-r--r--  1 mdbs       4 Feb 25 22:56 .dio.exe.pid
-rw-r--r--  1 mdbs       4 Feb 25 22:56 .dirman.exe.pid
-rw-r--r--  1 mdbs       4 Feb 26 22:56 .recp.exe.pid
-rw-r--r--  1 mdbs       2 Feb  4 14:35 .rhosts
-rw-rw-r--  1 mdbs    1189 Feb  4 14:35 .rhosts.bak
drwxrwxr-x  2 mdbs     512 Feb 25 22:56 Sockets/
drwxrwxr-x  2 mdbs     512 Feb 12 11:58 UserFiles/
drwxrwxr-x  2 mdbs     512 Nov  5 13:50 be.greg/
drwxrwxr-x  2 mdbs     512 Feb  4 19:49 bin/
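
The role of the .pid files described above can be illustrated with a short sketch. It is 
not taken from the MDBS sources: the function names write_pid_file and read_pid_file are 
hypothetical, and the text format used here is an assumption (the 4-byte files in the 
listing may well hold the number in another form). 

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: record this process's id in a file such as
   ".cc.exe.pid". Returns 0 on success, -1 on failure. */
int write_pid_file(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%ld\n", (long) getpid());
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: read the id back, e.g. so that a stop script
   can signal the process. Returns -1 if the file is missing or bad. */
long read_pid_file(const char *path)
{
    long pid = -1;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1)
        pid = -1;
    fclose(fp);
    return pid;
}
```

Because a new process id is assigned on every run, a file written this way has to be 
overwritten each time the process starts, which matches the per-run behavior noted above. 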


B. THE “Sockets” DIRECTORY 

The “Sockets” directory (under the root directory) contains the six sockets used for 
interprocess communication on each backend. These sockets are deleted and created anew 
for each run of MDBS. 

Sockets:
total 2
drwxrwxr-x  2 mdbs     512 Feb 25 22:56 ./
drwxrwxr-x 13 mdbs    1024 Feb 27 20:12 ../
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 CC=
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 DIO=
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 DM=
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 G_PCLB=
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 P_PCLB=
srwxrwxrwx  1 mdbs       0 Feb 25 22:56 RECP=
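
The delete-and-recreate behavior of these socket files can be sketched as follows. This is 
an illustration only, not the MDBS code: the function name make_fresh_socket is 
hypothetical, and the choice of a stream socket is an assumption, since the listing shows 
only that the entries are UNIX-domain sockets. 

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: remove any socket file left by an earlier run,
   then bind a new UNIX-domain socket at the same path.
   Returns the socket descriptor, or -1 on failure. */
int make_fresh_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (strlen(path) >= sizeof addr.sun_path)
        return -1;                      /* path would not fit */

    unlink(path);                       /* ignore error: file may not exist */

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    if (bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Without the unlink() call, bind() would fail on the second run because the socket file from 
the previous run would still exist. 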


C. THE “UserFiles” DIRECTORY 

This directory, located directly under the root directory, holds the descriptor (*.d) and 
template (*.t) files for each database (e.g., SALES) used by MDBS. Both of these files must 
be present or the database will not run. The listing below indicates that this backend is 
prepared to support three databases (i.e., COURSE, SALES, and SQD). For details on the 
composition of the descriptor and template files see [Bourgeois, 1993]. 


UserFiles:
total 12
drwxrwxr-x  2 mdbs     512 Feb 12 11:58 ./
drwxrwxr-x 13 mdbs    1024 Feb 27 20:12 ../
-rw-r--r--  1 mdbs      51 May 21  1992 COURSE.d
-rw-r--r--  1 mdbs     126 May 21  1992 COURSE.t
-rw-rw-r--  1 mdbs     140 Oct 19  1988 SALES.d
-rw-rw-r--  1 mdbs     121 Oct 19  1988 SALES.t
-rw-rw-r--  1 mdbs      43 Feb 12 10:50 SQD.d
-rw-rw-r--  1 mdbs     140 Feb 12 10:50 SQD.t
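
The requirement that both the descriptor and the template file exist for a database can be 
checked with a few lines of C. The sketch below is an assumption about how such a check 
might look; the function name database_files_present is hypothetical and does not appear 
in MDBS. 

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Return 1 if "<dir>/<db_name>.<ext>" exists and is readable, else 0. */
static int readable(const char *dir, const char *db_name, const char *ext)
{
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s.%s", dir, db_name, ext);

    if (n < 0 || (size_t) n >= sizeof path)
        return 0;                       /* name too long to check */
    return access(path, R_OK) == 0;
}

/* Hypothetical check: a database is usable only when both its
   descriptor (.d) and template (.t) files are present. */
int database_files_present(const char *dir, const char *db_name)
{
    assert(dir != NULL && db_name != NULL);
    return readable(dir, db_name, "d") && readable(dir, db_name, "t");
}
```

For the listing above, a call such as database_files_present("UserFiles", "SALES") would be 
expected to succeed, since both SALES.d and SALES.t are shown. 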


D. THE “be.version” DIRECTORY 

This directory, located under the root directory, is the top-level directory for each 
different version of MDBS software on each backend. In this case, the working version is 
“greg” (hence “be.greg”). This directory contains the executable files (*.exe) for the six 
backend processes. It also holds the trace files (*.tr). The trace files are text files output by 
the processes of the same names. The run.be script is used by the controller to start the six 
backend processes. 

be.greg:
total 2477
drwxrwxr-x  2 mdbs     512 Nov  5 13:50 ./
drwxrwxr-x 13 mdbs    1024 Feb 27 20:12 ../
-rwxr-xr-x  1 mdbs   83649 Nov  5 13:23 bget.exe*
-rw-r--r--  1 mdbs       0 Feb 25 22:56 bget.tr
-rwxr-xr-x  1 mdbs   83649 Nov  5 13:23 bput.exe*
-rw-r--r--  1 mdbs      60 Feb 25 22:58 bput.tr
-rwxr-xr-x  1 mdbs  246357 Nov  5 13:23 cc.exe*
-rw-r--r--  1 mdbs      37 Feb 25 22:58 cc.tr
-rwxr-xr-x  1 mdbs   89208 Nov  5 13:23 dio.exe*
-rw-r--r--  1 mdbs     465 Feb 25 22:58 dio.tr
-rwxr-xr-x  1 mdbs  375581 Nov  5 13:23 dirman.exe*
-rw-r--r--  1 mdbs     101 Feb 25 22:58 dirman.tr
-rw-r--r--  1 mdbs      19 Nov  4 17:10 exe.awk
-rwxr-xr-x  1 mdbs  335559 Nov  5 13:50 recp.exe*
-rw-r--r--  1 mdbs      37 Feb 25 22:58 recp.tr
-rwxr--r--  1 mdbs     431 Nov  5 13:37 run.be*
-rw-r--r--  1 mdbs     160 Nov  4 17:10 stop.exe


E. THE “bin” DIRECTORY 

Located under the greg.be directory, the bin directory holds the utility files used by the 
backend. The most important of these is the zero command (zero*), which is used to 
initialize the meta and data disks before each run. The other utilities present here are 
left over from earlier versions of MDBS. They are no longer required to run MDBS, but 
have been left here because they can sometimes be useful. The stop command (stop.cmd) 
may be used to stop MDBS processes running on this backend. The .list.stop and exe.awk 
files work with the stop command. The cpcount.c file copies a user-specified amount (in 
bytes) of an existing file to a new file specified by the user. The cpydisks script was once 
used to redistribute backend code. 

bin:
total 39
drwxrwxr-x  2 mdbs     512 Feb  4 19:49 ./
drwxrwxr-x 13 mdbs    1024 Feb 27 20:12 ../
-rw-r--r--  1 mdbs       0 Jun 14  1989 .list.stop
-rw-rw-r--  1 mdbs    1610 Jan 14  1989 cpcount.c
-rwxrwxr-x  1 mdbs      84 Jan 20  1989 cpydisks*
-rw-r--r--  1 mdbs      19 Nov  5 12:52 exe.awk
-rwxr-xr-x  1 mdbs     373 Feb  5 13:49 stop.cmd*
-rw-r--r--  1 mdbs       0 Feb  4 18:50 stop.trace
-rwxr--r--  1 mdbs   26439 Nov  5 12:52 zero*
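
The behavior attributed to cpcount.c (copying a user-specified number of bytes from an 
existing file to a new file) can be sketched as below. The original source is not 
reproduced in this thesis, so the function name copy_count and its interface are 
assumptions made for illustration. 

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sketch: copy at most `count` bytes from src_path to a
   new file dst_path. Returns the number of bytes copied, or -1 on error.
   Fewer than `count` bytes are copied when the source is shorter. */
long copy_count(const char *src_path, const char *dst_path, long count)
{
    FILE *in, *out;
    long copied = 0;
    int ch;

    assert(src_path != NULL && dst_path != NULL);
    in = fopen(src_path, "rb");
    if (in == NULL)
        return -1;
    out = fopen(dst_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }
    while (copied < count && (ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF) {     /* write failed */
            copied = -1;
            break;
        }
        copied++;
    }
    fclose(in);
    if (fclose(out) != 0)
        copied = -1;
    return copied;
}
```

A utility like this is convenient for carving a fixed-size prefix out of a large disk 
image or data file when testing. 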









APPENDIX E. DEMONSTRATION DATABASE 


The new AMMO (ABDL) database created for the demonstration of MDBS on the 
new hardware and software platform is listed below. 

A. THE DESCRIPTOR FILE (AMMO.d) 

AMMO 
TEMP b s 
! INFO 
! COUNT 
@ 

DODIC a s 
A001 J999 
K001 Z999 
@ 

NOMEN a s 
A G 
H R 
S Z 
@ 

QTY a i 
1 100 
100 1000 
@ 

$ 

B. THE TEMPLATE FILE (AMMO.t) 

AMMO 
2 
3 
INFO 
TEMP s 
DODIC s 
NOMEN s 
3 
COUNT 
TEMP s 
DODIC s 
QTY i 

C. THE RECORD FILE (AMMO.r) 

AMMO 

9 

INFO 

D680 Projo 
D681 Projo 
N232 Fuze 
N340 Fuze 
@ 

COUNT 
D680 200 
D681 150 
N232 180 
N340 170 
$ 

D. THE QUERY FILE (AMMOreql) 

(RETRIEVE(TEMP=INFO) (DODIC,NOMEN)BY DODIC)% 
(RETRIEVE(TEMP=COUNT) (DODIC,QTY)BY DODIC)% 

(INSERT(<TEMP,INFO>,<DODIC,M130>,<NOMEN,PROP>))% 
(DELETE((TEMP=INFO) and (DODIC=M130)))% 








APPENDIX F. NEW MDBS FUNCTIONS 


During this porting project, numerous adjustments and modifications were made to 
the existing code. A few existing functions were almost completely rewritten. New 
constants were also added as needed. Consistent with the goals outlined in Chapter 1, 
though, only two wholly new functions with their associated calls were added to MDBS. 

A. THE “host_name_integer” FUNCTION 

This function, located in the ack.c source file, receives a hostname (which may be a 
member of the host_names array) and returns only the number portion as an integer. This 
integer is used by other functions to uniquely identify the workstation. This function is 
called from many locations within ack.c. 

int host_name_integer(host_name) 

/* this function is passed an element of the host_names array 
(e.g. "db11") - it returns the number part as an integer. This 
routine replaces the old way of picking the number part of the host 
name. */ 

char host_name[]; 

{ 
    char temp[host_name_len + 1]; 
    int temp_index = 0; 
    int i; 

    for (i = 0; i < host_name_len; i++) { 
        if (isdigit(host_name[i]) != 0) { 
            temp[temp_index] = host_name[i]; 
            temp_index++; 
        } 
    } 

    temp[temp_index] = '\0'; 

    return(atoi(temp)); 

} /* end host_name_integer() */ 
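
For comparison, the same digit-extraction idea can be written in ANSI C with an explicit 
bound on the scratch buffer. This is a sketch under stated assumptions rather than the 
MDBS code: the name host_name_digits and the constant DIGIT_BUF_LEN are hypothetical 
(the original relies on a host_name_len constant defined elsewhere in MDBS). 

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical limit on how many digits are collected; nine decimal
   digits always fit in a 32-bit int. */
#define DIGIT_BUF_LEN 9

/* Collect the decimal digits of a host name (e.g. "db11") in order and
   return them as an integer. Returns 0 when the name has no digits. */
int host_name_digits(const char *host_name)
{
    char digits[DIGIT_BUF_LEN + 1];
    size_t n = 0;
    size_t i;

    assert(host_name != NULL);
    for (i = 0; host_name[i] != '\0' && n < DIGIT_BUF_LEN; i++) {
        /* the cast avoids undefined behavior for negative char values */
        if (isdigit((unsigned char) host_name[i]))
            digits[n++] = host_name[i];
    }
    digits[n] = '\0';
    return atoi(digits);
}
```

Like the original, this version concatenates every digit it finds, so a name such as 
"db3x7" would yield 37 rather than 3. 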


B. THE “init_meta_NATT” FUNCTION 

This function, located in the meta.c source file, uses the global definitions of 
first_record_cylinder and first_record_track (meta.def) to load initial values to the 
Next-Available-Track-Table (NATT) on the meta-data disk. This function is called from the 
code in the dirman.c source file. 

init_meta_NATT() 

/* Store record disk's starting cylinder/track values to NATT */ 

{ 
    unsigned short fir_rec_cyl = first_record_cylinder; 
    unsigned char fir_rec_trk = first_record_track; 
    long lseek(); 

#ifdef EnExFlag 
    printf("Enter init_meta_NATT\n"); 
    fflush(stdout); 
#endif 

    /* seek to NATT area of meta disk */ 

    if (lseek(metafptr, (long) NATT_OFFSET, 0) != NATT_OFFSET) 
        SysError(8, "1 init_meta_NATT"); 

    /* write initial values */ 

    if (write(metafptr, &fir_rec_cyl, sizeof(fir_rec_cyl)) < 0) 
        SysError(12, "2 init_meta_NATT"); 
    if (write(metafptr, &fir_rec_trk, sizeof(fir_rec_trk)) < 0) 
        SysError(12, "3 init_meta_NATT"); 

#ifdef EnExFlag 
    printf("Exit init_meta_NATT\n"); 
    fflush(stdout); 
#endif 

} /* end init_meta_NATT */ 
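
The seek-then-write pattern used by this function can be exercised on an ordinary file. 
The sketch below is illustrative only: natt_store, natt_load, natt_round_trip, and the 
DEMO_NATT_OFFSET value are hypothetical names and values, not part of MDBS, and the 
read-back routine is an assumption about how the stored pair would later be retrieved. 

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical offset; the real NATT_OFFSET is defined in the MDBS sources. */
#define DEMO_NATT_OFFSET 512L

/* Seek to `offset` and write the cylinder/track pair. Returns 0 on success. */
int natt_store(int fd, off_t offset, unsigned short cyl, unsigned char trk)
{
    if (lseek(fd, offset, SEEK_SET) != offset)
        return -1;
    if (write(fd, &cyl, sizeof cyl) != (ssize_t) sizeof cyl)
        return -1;
    if (write(fd, &trk, sizeof trk) != (ssize_t) sizeof trk)
        return -1;
    return 0;
}

/* Seek to `offset` and read the pair back. Returns 0 on success. */
int natt_load(int fd, off_t offset, unsigned short *cyl, unsigned char *trk)
{
    assert(cyl != NULL && trk != NULL);
    if (lseek(fd, offset, SEEK_SET) != offset)
        return -1;
    if (read(fd, cyl, sizeof *cyl) != (ssize_t) sizeof *cyl)
        return -1;
    if (read(fd, trk, sizeof *trk) != (ssize_t) sizeof *trk)
        return -1;
    return 0;
}

/* Store and reload a pair through a scratch file standing in for the
   meta-data disk. Returns 0 when the values read back match. */
int natt_round_trip(const char *path)
{
    unsigned short cyl = 0;
    unsigned char trk = 0;
    int ok;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    ok = natt_store(fd, DEMO_NATT_OFFSET, 3, 7) == 0
      && natt_load(fd, DEMO_NATT_OFFSET, &cyl, &trk) == 0
      && cyl == 3 && trk == 7;
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

Unlike the listing above, this sketch treats a short write as an error by comparing the 
return value of write() with the requested size instead of only testing for a negative 
result. 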









LIST OF REFERENCES 


Bourgeois, Paul A., “The Instrumentation of the Multimodel and Multilingual User 
Interface,” M. S. Thesis, Naval Postgraduate School, Monterey, California, March, 
1993. 

Boyne, Richard D. and Demurjian, Steven A. and Hsiao, David K. and Kerr, Douglas S. 
and Orooji, Ali, “The Implementation of a Multi-Backend Database System (MDBS): 
Part III,” Technical Report, Naval Postgraduate School, Monterey, California, March, 
1983. 

De Witt, David J., “DIRECT - A Multiprocessor Organization for Supporting Relational 
Database Management Systems,” IEEE Transactions on Computers, Vol. C-28, No. 6, 
June, 1979, pp. 395 - 406. 

Elmasri, Ramez and Shamkant B. Navathe. Fundamentals of Database Systems . New 
York, The Benjamin/Cummings Publishing Company, Inc., 1989. 

Hall, James E., “Performance Evaluations of a Parallel and Expandable Database 
Computer - The Multi-Backend Database Computer,” M. S. Thesis, Naval 
Postgraduate School, Monterey, California, June, 1989. 

Hammond, Greg Alan, “The Instrumentation of a Parallel, Distributed Database Operation, 
Retrieve-Common, for Merging two large sets of records,” M. S. Thesis, Naval 
Postgraduate School, Monterey, California, June 1992. (HI7526) 

He, Xingui and Higashida, Masanobu and Hsiao, David K. and Kerr, Douglas S. and 
Orooji, Ali and Shi, Zong-Zhi and Strawser, Paula, “The Implementation of a Multi- 
Backend Database System (MDBS): Parts II and III,” Technical Report, Naval 
Postgraduate School, Monterey, California, July 1982. 

Hsiao, David K., “A Parallel, Scalable, Microprocessor-Based Database Computer for 
Performance Gains and Capacity Growth,” IEEE Micro, December 1991, pp. 44-60. 

Hsiao, David K., “Federated Databases and Systems: Part I - A Tutorial on Their Data 
Sharing,” Very Large Database (VLDB) Journal, vol 1, no 1, 1992, pp. 127-179. 

Hsiao, David K., “Federated Databases and Systems: Part II - A Tutorial on Their Resource 
Consolidation,” Very Large Database (VLDB) Journal, vol 1, no 2, 1992, pp. 285-310. 

Hsiao, David K. and Kamel, Magdi N., “Heterogeneous Databases: Proliferations, Issues, 
and Solutions,” IEEE Transactions on Knowledge and Data Engineering, vol 1, no 1, 
March 1989, pp. 45-62. 





Kloepping, Gary R. and Mack, John F., “The Design and Implementation of a 
Relational Interface for the Multi-Lingual Database System,” M. S. Thesis, Naval 
Postgraduate School, Monterey, California, June 1985. (K587163) 

Leffler, Sam and Fabry, Robert S. and Joy, William N. and Lapsley, Phil, “An Advanced 
4.3BSD Interprocess Communication Tutorial,” Integrated Solutions UNIX 
Programmers Supplementary Documents, July, 1987. 

Little, Craig W., “The Design and Implementation of Pedagogical Software for the Multi- 
Backend/Multi-Lingual Database System,” M. S. Thesis, Naval Postgraduate School, 
Monterey, California, December 1987. (L692) 

Meeks, Andrew P., “The Instrumentation of the Multibackend Database System,” M. S. 
Thesis, Naval Postgraduate School, Monterey, California, June 1993. 

Kernighan, Brian W. and Ritchie, Dennis M., The C Programming Language (Second 
Edition). Englewood Cliffs, New Jersey, Prentice-Hall, Inc., 1988. 

Que Corporation, Using UNIX . Carmel, Indiana, Que Corporation, 1990. 

Rieken, Bill and Weiman, Lyle, Adventures in UNIX Network Applications Programming. 
New York, New York, John Wiley and Sons, Inc., 1992. 

Rochkind, Marc J., Advanced UNIX Programming . Englewood Cliffs, New Jersey, 
Prentice-Hall, Inc., 1985. 

Rosch, Winn L., The Winn Rosch Hardware Bible. New York, New York, Simon & 
Schuster, Inc., 1989. 

Rosen, Kenneth H. and Rosinski, Richard R. and Farber, James M., Unix. System V 
Release 4. An Introduction . Berkeley, California, Osborne McGraw-Hill, 1990. 

Stevens, Richard W., Advanced Programming in the UNIX Environment. Reading, 
Massachusetts, Addison-Wesley Publishing Company, Inc., 1992. 

United States House of Representatives, “DoD Automated Information Systems 
Experience Runaway Costs and Years of Schedule Delays While Providing Little 
Capability,” Report 101-382, November 1989. 

Wong, Albert, “Toward Highly Portable Database Systems: Issues and Solutions,” M. S. 
Thesis, Naval Postgraduate School, Monterey, California, June 1986. (W755) 

Zawis, John A., “Accessing Hierarchical Databases via SQL Transactions in a Multi- 
Modal Database System,” M. S. Thesis, Naval Postgraduate School, Monterey, 
California, December, 1987. 





INITIAL DISTRIBUTION LIST 


1. Defense Technical Information Center 
Cameron Station 
Alexandria, VA 22304-6145 

2. Dudley Knox Library 
Code 52 

Naval Postgraduate School 
Monterey, CA 93943-5002 

3. Commandant of the Marine Corps 
Code TE 06 

Headquarters, U.S. Marine Corps 
Washington, D.C. 20380-0001 

4. Ms. Doris Mlezko 
Code P22305 
NAWCWPNS 

Point Mugu, CA 93042-5001 

5. Chairman, Code CS 
Computer Science Department 
Naval Postgraduate School 
Monterey, CA 93943 

6. Professor David K. Hsiao, Code CS/Hq 
Computer Science Department 

Naval Postgraduate School 
Monterey, CA 93943 

7. Ronald J. Roland 
500 Sloat Avenue 
Monterey, CA 93940 

8. Major Stanley H. Watkins, USMC 
6701 Abbey Road 
Bartlesville, OK 74006 





